Preface

This manual describes the TOPS-10/TOPS-20 Common Math Library. At
present, the library is included as part of each object-time system of each
language that uses it. In the future, the library will be a separate entity as
described in this manual. Chapter 1 introduces the library routines and gives
information on how they are described. A table of the routines, arranged in
alphabetical order, is included for easy reference. Chapters 2 through 15
contain the descriptions of the routines, grouped logically such that all like
routines are together (e.g., all the square root routines are in Chapter 2).
Appendix A gives the results of the ELEFUNT tests and Appendix B
describes error handling for MACRO programs.
Chapter 1
Introduction 



1.1 The Math Library 



The TOPS-10/TOPS-20 Common Math Library contains a set of routines 
that perform the following mathematical functions for several types of data. 

square root 

natural and base-10 logarithm

exponential and exponentiation 

trigonometric 

inverse trigonometric 

hyperbolic 

random number generation 

absolute value 

data type conversion 

rounding and truncation 

product 

remainder 

positive difference 

transfer of sign 

maximum or minimum of a series 

complex conjugate 

complex multiplication or division 



Most of the routines are functions, but some, notably the complex
double-precision routines, are subroutines. The difference between the types of
routines is the way in which they are called from a program. Consult the
applicable language manual for more information.

The routines are listed alphabetically in Table 1-1 with a short description of 
each and a page reference. 
Table 1-1: Math Library Routines

Routine Name   Page    Purpose

ABS            9-4     absolute value
ACOS           6-4     arc cosine
AIMAG          15-4    imaginary part of complex number
AINT           11-9    truncation to integer
ALOG           3-3     natural logarithm
ALOG10         3-5     base-10 logarithm
AMAX0          14-5    largest of a series
AMAX1          14-6    largest of a series
AMIN0          14-11   smallest of a series
AMIN1          14-12   smallest of a series
AMOD           12-6    remainder
ANINT          11-6    nearest whole number
ASIN           6-3     arc sine
ATAN           6-13    arc tangent
ATAN2          6-15    polar angle of a point in the x-y plane
CABS           9-7     complex absolute value
CCOS           5-21    complex cosine
CDABS          9-8     complex, double-precision, D-floating-point absolute value
CDCOS          5-25    complex, double-precision, D-floating-point cosine
CDEXP          4-11    complex, double-precision, D-floating-point exponential
CDLOG          3-17    complex, double-precision, D-floating-point natural logarithm
CDSIN          5-23    complex, double-precision, D-floating-point sine
CDSQRT         2-11    complex, double-precision, D-floating-point square root
CEXP           4-9     complex exponential
CEXP2.         4-22    exponentiation of a complex number to the power of an integer
CEXP3.         4-34    exponentiation of a complex number to the power of another complex number
CFDV           15-7    complex division
CFM            15-6    complex multiplication
CGABS          9-9     complex, double-precision, G-floating-point absolute value
CGCOS          5-29    complex, double-precision, G-floating-point cosine
CGEXP          4-13    complex, double-precision, G-floating-point exponential
CGLOG          3-19    complex, double-precision, G-floating-point natural logarithm
CGSIN          5-27    complex, double-precision, G-floating-point sine
CGSQRT         2-13    complex, double-precision, G-floating-point square root
CLOG           3-15    complex natural logarithm
CMPL.C         10-23   conversion of two complex numbers to one complex number
CMPL.D         10-21   conversion of two double-precision, D-floating-point numbers to complex format
CMPL.G         10-22   conversion of two double-precision, G-floating-point numbers to complex format
CMPL.I         10-19   conversion of two integers to complex format
CMPLX          10-20   conversion of two single-precision numbers to complex format
CONJ           15-5    complex conjugate
COS            5-7     cosine (angle in radians)
COSD           5-9     cosine (angle in degrees)
COSH           7-4     hyperbolic cosine
COTAN          5-33    cotangent
CSIN           5-19    complex sine
CSQRT          2-9     complex square root
DABS           9-5     double-precision, D-floating-point absolute value
DACOS          6-7     double-precision, D-floating-point arc cosine
DASIN          6-5     double-precision, D-floating-point arc sine
DATAN          6-17    double-precision, D-floating-point arc tangent
DATAN2         6-19    double-precision, D-floating-point polar angle of a point in the x-y plane
DBLE           10-12   conversion from single-precision to double-precision, D-floating-point format
DCOS           5-13    double-precision, D-floating-point cosine
DCOSH          7-7     double-precision, D-floating-point hyperbolic cosine
DCOTAN         5-37    double-precision, D-floating-point cotangent
DDIM           12-11   double-precision, D-floating-point positive difference
DEXP           4-5     double-precision, D-floating-point exponential
DEXP2.         4-18    exponentiation of a double-precision, D-floating-point number to the power of an integer
DEXP3.         4-28    exponentiation of a double-precision, D-floating-point number to the power of another double-precision, D-floating-point number
DFLOAT         10-11   conversion of an integer to double-precision, D-floating-point format
DIM            12-10   positive difference
DINT           11-10   double-precision, D-floating-point truncation
DLOG           3-7     double-precision, D-floating-point natural logarithm
DLOG10         3-9     double-precision, D-floating-point base-10 logarithm
DMAX1          14-7    double-precision, D-floating-point largest in a series
DMIN1          14-13   double-precision, D-floating-point smallest in a series
DMOD           12-7    double-precision, D-floating-point remainder
DNINT          11-7    double-precision, D-floating-point nearest whole number
DPROD          12-3    double-precision, D-floating-point product
DSIGN          13-5    double-precision, D-floating-point transfer of sign
DSIN           5-11    double-precision, D-floating-point sine
DSINH          7-5     double-precision, D-floating-point hyperbolic sine
DSQRT          2-5     double-precision, D-floating-point square root
DTAN           5-35    double-precision, D-floating-point tangent
DTANH          7-12    double-precision, D-floating-point hyperbolic tangent
DTOG           10-17   conversion of a double-precision, D-floating-point number to double-precision, G-floating-point format
DTOGA          10-18   conversion of an array of double-precision, D-floating-point numbers to double-precision, G-floating-point format
EXP            4-3     exponential
EXP1.          4-15    exponentiation of an integer to the power of another integer
EXP2.          4-16    exponentiation of a single-precision number to the power of an integer
EXP3.          4-25    exponentiation of a single-precision number to the power of another single-precision number
FLOAT          10-8    conversion of an integer to single-precision format
GABS           9-6     double-precision, G-floating-point absolute value
GACOS          6-11    double-precision, G-floating-point arc cosine
GASIN          6-9     double-precision, G-floating-point arc sine
GATAN          6-21    double-precision, G-floating-point arc tangent
GATAN2         6-23    double-precision, G-floating-point polar angle of a point in the x-y plane
GCOS           5-17    double-precision, G-floating-point cosine
GCOSH          7-10    double-precision, G-floating-point hyperbolic cosine
GCOTAN         5-41    double-precision, G-floating-point cotangent
GDB.n          10-16   conversion of a single-precision number to double-precision, G-floating-point format
GDIM           12-12   double-precision, G-floating-point positive difference
GEXP           4-7     double-precision, G-floating-point exponential
GEXP2.         4-20    exponentiation of a double-precision, G-floating-point number to the power of an integer
GEXP3.         4-31    exponentiation of a double-precision, G-floating-point number to the power of another double-precision, G-floating-point number
GFL.n          10-15   conversion of an integer to double-precision, G-floating-point format
GFX.n          10-6    conversion of a double-precision, G-floating-point number to integer format
GINT.          11-11   double-precision, G-floating-point truncation
GLOG           3-11    double-precision, G-floating-point natural logarithm
GLOG10         3-13    double-precision, G-floating-point base-10 logarithm
GMAX1          14-8    double-precision, G-floating-point largest of a series
GMIN1          14-14   double-precision, G-floating-point smallest of a series
GMOD           12-8    double-precision, G-floating-point remainder
GNINT.         11-8    double-precision, G-floating-point nearest whole number
GPROD.         12-4    double-precision, G-floating-point product
GSIGN          13-6    double-precision, G-floating-point transfer of sign
GSIN           5-15    double-precision, G-floating-point sine
GSINH          7-8     double-precision, G-floating-point hyperbolic sine
GSN.n          10-10   conversion of a double-precision, G-floating-point number to single-precision format
GSQRT          2-7     double-precision, G-floating-point square root
GTAN           5-39    double-precision, G-floating-point tangent
GTANH          7-13    double-precision, G-floating-point hyperbolic tangent
GTOD           10-13   conversion of a double-precision, G-floating-point number to double-precision, D-floating-point format
GTODA          10-14   conversion of an array of double-precision, G-floating-point numbers to double-precision, D-floating-point format
IABS           9-3     integer absolute value
IDIM           12-9    integer positive difference
IDINT          10-5    conversion of a double-precision, D-floating-point number to integer format
IDNINT         11-4    integer nearest whole number for a double-precision, D-floating-point number
IFIX           10-3    conversion of a single-precision number to integer format
IGNIN.         11-5    integer nearest whole number for a double-precision, G-floating-point number
INT            10-4    conversion of a single-precision number to integer format
ISIGN          13-3    integer transfer of sign
MAX0           14-3    largest of a series
MAX1           14-4    largest of a series
MIN0           14-9    smallest of a series
MIN1           14-10   smallest of a series
MOD            12-5    integer remainder
NINT           11-3    integer nearest whole number for a single-precision number
RAN            8-3     random number generator
RANS           8-5     random number generator with shuffling
REAL           10-7    conversion of an integer to single-precision format
REAL.C         15-3    real part of a complex number
SAVRAN         8-7     save the seed for the last random number generated
SETRAN         8-6     set the seed value for the random number generator
SIGN           13-4    transfer of sign
SIN            5-3     sine (angle in radians)
SIND           5-5     sine (angle in degrees)
SINH           7-3     hyperbolic sine
SNGL           10-9    conversion of a double-precision, D-floating-point number to single-precision format
SQRT           2-3     square root
TAN            5-31    tangent
TANH           7-11    hyperbolic tangent


The routines in this library are available to most of the languages available
with TOPS-10 and TOPS-20. Consult the applicable language manual for
specific information on how to use the Math Library. Although all of the
routines listed in Table 1-1 exist in the library, not all of them can be called
from all languages. That is, some languages or compilers have restrictions
that disallow calling of a particular routine from a user program. For example,
the complex data type does not exist in PASCAL, so the routines that perform
complex mathematics are never called by a PASCAL program. However, a
compiler may itself call a routine because a user program has a statement that
necessitates use of a Math Library routine. For example, a FORTRAN program
cannot call any of the routines whose names contain a period (.). However,
the compiler recognizes when a statement within a program requires use
of one of those routines, and the compiler calls the appropriate routine.
Similarly, a statement in an APL program may require a mathematical function,
so the APL interpreter translates that statement into a call to the appropriate
Math Library routine.

1.2 Math Symbols and Names Used in Equations 

Throughout this manual, certain mathematical symbols and names are used 
to indicate values, quantities, actions, or states. These symbols and their 
meanings are listed below. 



=         equal to
+         plus
-         minus
•         multiplied by (used in equations)
×         multiplied by (used in numbers)
/         divided by
>         greater than
≥         greater than or equal to
<         less than
≤         less than or equal to
≠         not equal to
√         square root
π         Pi (3.14159265358979323846264338327950)
±         plus or minus
[ ]       greatest integer in
| |       absolute value
≈         equals approximately
x_y       subscript
x^y       superscript or raised to the power
log_e     natural logarithm
log_10    base-10 logarithm
i         imaginary number (√-1)
e^x       exponential
sin       sine of an angle
cos       cosine of an angle
tan       tangent of an angle
cot       cotangent of an angle
sin^-1    arc sine
cos^-1    arc cosine
tan^-1    arc tangent
sinh      hyperbolic sine
cosh      hyperbolic cosine
tanh      hyperbolic tangent
sgn       sign of
conj      complex conjugate

In addition, some equations use the names of routines to indicate a state or
action. These routines and their meanings are as follows.

FLOAT   convert and round from an integer to a single-precision,
        floating-point number
INT     convert and truncate from a single-precision, floating-point
        number to an integer
MAX     largest of a series
MIN     smallest of a series
MOD     remainder

Each of these routines is described in detail in this manual.

Also, machine infinity (or infinity) is a term used to indicate the largest or 
smallest number representable in the machine. 

+machine infinity = 377777777777 (octal) for single-precision
                  = 377777777777, 377777777777 (octal) for double-precision

-machine infinity = 400000000000 (octal) for single-precision
                  = 400000000000, 000000000001 (octal) for double-precision
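
As a rough illustration of what the +machine infinity pattern stands for, the following Python sketch decodes the single-precision octal word. The field layout used here (a sign bit, an 8-bit excess-128 exponent, and a 27-bit fraction with the binary point at its left) is an assumption made for the example, and decode_single is a hypothetical helper, not a library routine.

```python
import math

def decode_single(word: int) -> float:
    """Decode a non-negative 36-bit word under the assumed single-precision layout."""
    exponent = (word >> 27) & 0o377   # assumed: 8-bit exponent stored excess-128
    fraction = word & (2**27 - 1)     # assumed: 27-bit fraction, binary point at its left
    return math.ldexp(fraction, exponent - 128 - 27)

plus_infinity = int("377777777777", 8)   # octal pattern quoted in the text
largest_integer = plus_infinity          # the same bit pattern read as an integer
largest_single = decode_single(plus_infinity)

print(largest_integer)           # 34359738367, i.e. 2**35 - 1
print(f"{largest_single:.2e}")   # 1.70e+38
```

Under these assumptions the same word reads as the largest integer and as the largest single-precision magnitude quoted in Section 1.3.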

1.3 Data Types and Their Precision 

The Common Math Library routines can handle several data types: integer;
single-precision, floating-point (also called real); double-precision,
D-floating-point; double-precision, G-floating-point; complex; complex,
double-precision, D-floating-point; and complex, double-precision,
G-floating-point. Each data type is described in detail in one of the
following sections.

1.3.1 Integer 

An integer value is a string of one to eleven digits that represents a whole 
decimal number (a number without a fractional part). Integer values must be 
within the range of -2^35 to +2^35 - 1 (-34359738368 to +34359738367).

1.3.2 Single-Precision, Floating-Point 

Single-precision, floating-point values may be of any size; however, each will
be rounded to fit the precision of 27 bits (7 to 9 decimal digits).

Precision for single-precision, floating-point values is maintained to
approximately eight significant digits; the absolute precision depends upon
the numbers involved.

The range of magnitude permitted a single-precision, floating-point value is
from approximately 1.47 × 10^-39 to 1.70 × 10^+38.



1-10 TOPS-10/TOPS-20 Common Math Library Reference Manual 



1.3.3 Double-Precision, D-Floating-Point 

Double-precision, D-floating-point values are similar to single-precision,
floating-point values; the differences between these two values are:

• Double-precision, D-floating-point values, depending on their magnitude,
  have precision of 62 bits, rather than the 27-bit precision obtained for
  single-precision, floating-point values.

• Each double-precision, D-floating-point value occupies two storage
  locations.

The range of magnitude permitted a double-precision, D-floating-point value
is from approximately 1.47 × 10^-39 to 1.70 × 10^+38.

1.3.4 Double-Precision, G-Floating-Point (see Note 1)

Double-precision, G-floating-point values are similar to double-precision,
D-floating-point values. They differ in:

• the number of bits of exponent

• the number of bits of mantissa

• the range of numbers they can represent

• the digits of precision

Table 1-2 summarizes the differences among single-precision and the two
forms of double-precision.

Table 1-2: Comparison of Single-Precision, D-Floating-Point, and
G-Floating-Point

                   Bits of    Bits of                                        Digits of
                   Exponent   Mantissa   Range                               Precision

single-precision   8          27         1.47 × 10^-39 to 1.70 × 10^+38      8.1
D-floating-point   8          62         1.47 × 10^-39 to 1.70 × 10^+38      18.7
G-floating-point   11         59         2.78 × 10^-309 to 8.99 × 10^+307    17.8

Note 1: Double-precision, G-floating-point data type is available only with
TOPS-20 Version 5 (or later) on the DECSYSTEM-20 KL10 model B.
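
The range and precision columns of Table 1-2 follow from the bit counts. The Python sketch below reproduces them under two assumptions that are not stated in this section: the exponent is stored in excess-2^(n-1) form, and a normalized fraction lies between 1/2 and 1. The function name describe is hypothetical.

```python
import math

def describe(exponent_bits: int, mantissa_bits: int) -> tuple[float, float, float]:
    """Return (smallest magnitude, largest magnitude, decimal digits) for a format."""
    bias = 2 ** (exponent_bits - 1)                    # assumed excess-2^(n-1) exponent
    smallest = math.ldexp(0.5, -bias)                  # fraction 1/2 at the lowest exponent
    largest = math.ldexp(1.0 - 2.0 ** -mantissa_bits, bias - 1)  # fraction just below 1
    digits = mantissa_bits * math.log10(2.0)           # bits converted to decimal digits
    return smallest, largest, digits

for name, e_bits, m_bits in [("single-precision", 8, 27),
                             ("D-floating-point", 8, 62),
                             ("G-floating-point", 11, 59)]:
    lo, hi, digits = describe(e_bits, m_bits)
    print(f"{name:18s}{lo:.2e} to {hi:.2e}   {digits:.1f} digits")
```

With these assumptions the printed ranges and digit counts agree with the table entries to the precision shown there.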
1.3.5 Complex

A complex value contains two numbers; it is assumed that the first (leftmost) 
value of the pair represents the real part of the number and that the second 
value represents the imaginary part of the number. The values that represent 
the real and imaginary parts of a complex value occupy two consecutive 
storage locations. 

1.3.6 Complex, Double-Precision 

You can use two types of complex, double-precision values — D-floating-point 
and G-floating-point. Both are assumed to be double-precision arrays with 
two elements. The first element is the real part, and the second element is the 
imaginary part. 



1.4 Information About the Routines 

Each routine described in this manual has the following information provided. 

• A short description 

• The names of other routines called by the routine 

• The data type and range of the argument(s) 

• The data type and range of the result 

• The accuracy of the result 

• The algorithm used to calculate the result 

• A reference to any text used for information about the algorithm (where 
applicable) 

• Any error conditions and the messages that result 

Some additional information about the routines not included in each write-up 
is: 

• Calling sequence 

• Entry points 

• Return location(s) 

• Register usage 

This information is described below. It is not included for each routine
because it is identical for most routines and is relevant only for MACRO and
BLISS users.
1.4.1 Calling Sequence

Most routines are called by an identical calling sequence. This calling
sequence is:

        XMOVEI  L,ARG
        PUSHJ   P,routine-name

ARG is the address of the argument block. L is the pointer to the argument
list for the routine; it is AC16. P is the stack pointer; it is AC17. Note that the
contents of L (AC16) are not preserved.

For example, the SQRT routine is called by:

        XMOVEI  16,ARG
        PUSHJ   17,SQRT

Those routines called by a different calling sequence contain the calling
sequence in their descriptions.

1.4.2 Entry Points 

In most cases each routine has at least two entry points — its name and its 
name followed by a period. For example, SQRT and SQRT. are entry points 
for the SQRT routine. The name with the period is the one used by the 
FORTRAN compiler. Some routines have additional entry points because 
they perform more than one function. Thus, one routine calculates both sine 
and cosine, so SIN, SIN., COS, and COS. are all entry points into that 
routine. If you are calling a routine from a MACRO or BLISS program, you 
can use the name of the routine as the entry point; it will always work. 

1.4.3 Return Location 

The result of the calculation of most routines is returned to one or two
registers. For integer and single-precision results, the return location is
register 0. For double-precision and complex (single-precision) results, the
return locations are registers 0 and 1. For complex, double-precision results,
the return location must be specified as the second argument included in the
call to the routine. The requirements for the arguments included in the call
are included with each write-up of the complex, double-precision routines.

1.4.4 Register Usage 

All the routines have similar register usage. Some may use more registers than
others, however. As stated above, registers 0 and 1 are used for the return
locations; therefore the original contents of one or both are lost on return from
a routine. These registers are also occasionally used to store the argument
initially. Registers 2 through 15 are saved, used, and restored. The number of
such registers used depends on the routine.
1.5 Accuracy Tests



Each routine contains a section headed "Accuracy of Result." The accuracy 
figures were obtained from the tests described below. These tests were run 
with typical values for arguments. There may be unusual arguments that 
could cause larger errors; for example, if you get too close to a threshold that 
could cause overflow or underflow, larger errors can occur. The format of the 
accuracy section is as follows. Note that the elements are explained with the 
descriptions of the tests. 



Accuracy of Result

test interval:            0.00000 through 1.0000
MRE:                      1.55 × 10^-8 (25.9 bits)
RMS:                      3.76 × 10^-9 (28.0 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     8%     83%    9%     0%




To test a routine, several representative intervals for each routine were
chosen. Sample values were then chosen randomly from each interval,
approximately 200,000 for single-precision and 20,000 for double-precision.
Each routine was then called using these values. The relative error of each
result was then obtained by the following equation.

    relative error = (actual exact result - result of routine) / actual exact result

For example:

    relative error = (sin(x) - SIN(x)) / sin(x)




The test computed the maximum relative error (MRE) and the average relative
error, called the root mean square (RMS). To interpret the MRE and
RMS, consider an "exact" routine, one that always returns an exact result
rounded to machine precision. Such a routine would show a maximum relative
error of 2^-27 for single-precision; 2^-62 for double-precision,
D-floating-point; and 2^-59 for double-precision, G-floating-point. To make
the MRE and RMS more understandable in terms of bits of accuracy, the tests
also give the number of bits of accuracy by finding the negative base-2
logarithm of the MRE and RMS. For the "exact" routine, the negative base-2
logarithm of the MRE would be 27 for single-precision; 62 for double-precision,
D-floating-point; and 59 for double-precision, G-floating-point. The negative
base-2 logarithm of the RMS error from an "exact" routine would be about 28.3,
63.3, and 60.3, respectively. These numbers are slightly larger than those for
the MRE because they reflect the RMS average of the "worst case" of exactness
(only 27 or 62 or 59 bits correct) and the "best case" (infinite bits correct).
Therefore, the closer the number of bits of accuracy of a routine approaches
that of an "exact" routine, the more accurate the routine. The accuracy
figures for "exact" routines for the three levels of precision are as follows.



Single-Precision

test interval:            0.00000 through 8192.0
MRE:                      7.44 × 10^-9 (27.0 bits)
RMS:                      3.11 × 10^-9 (28.3 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     100%   0%     0%

Double-precision, D-floating-point

test interval:            -infinity to +infinity
MRE:                      2.17 × 10^-19 (62.0 bits)
RMS:                      8.81 × 10^-20 (63.3 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     100%   0%     0%

Double-precision, G-floating-point

test interval:            -infinity to +infinity
MRE:                      1.73 × 10^-18 (59.0 bits)
RMS:                      7.05 × 10^-19 (60.3 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     100%   0%     0%
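
The single-precision figures for an "exact" routine can be approximated with a short simulation. The Python sketch below is only an illustration of the MRE and RMS calculation, not the test program that produced the figures: it treats a Python double as the exact result, rounds it to a 27-bit mantissa, and measures the relative error of that rounding. The helper name round_to_mantissa and the seed are arbitrary choices.

```python
import math
import random

MANTISSA_BITS = 27  # single-precision mantissa width

def round_to_mantissa(x: float, bits: int = MANTISSA_BITS) -> float:
    """Round x to the nearest value with a `bits`-bit mantissa."""
    m, e = math.frexp(x)                       # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(round(m * 2 ** bits), e - bits)

random.seed(1)  # arbitrary seed, only to make the run repeatable
errors = []
for _ in range(200_000):
    x = random.uniform(0.0, 8192.0)
    if x == 0.0:
        continue                               # relative error is undefined at zero
    errors.append(abs(x - round_to_mantissa(x)) / x)

mre = max(errors)                                               # maximum relative error
rms = math.sqrt(sum(err * err for err in errors) / len(errors)) # root mean square error
mre_bits = -math.log2(mre)
rms_bits = -math.log2(rms)
print(f"MRE {mre:.2e} ({mre_bits:.1f} bits)")
print(f"RMS {rms:.2e} ({rms_bits:.1f} bits)")
```

A run of this sketch should give bit counts close to the 27.0 and 28.3 quoted for the single-precision "exact" routine; the small gap between the two reflects the averaging described in the text.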



A second test compared the result of the routines with the exact result
rounded to single- or double-precision. It counted the number of times the
routine's result agreed exactly with the rounded exact result, the number of
times they differed by ±1 bit, ±2 bits, and so on. The result of these
comparisons is expressed as a percent of error distribution for the least
significant bit (LSB).
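
The counting behind an LSB error distribution can be sketched as follows. This Python example is an illustration under stated assumptions rather than the actual test: the "routine" is a hypothetical one that truncates the exact square root to 27 bits instead of rounding it, so every result is either equal to the rounded exact result or one LSB below it.

```python
import math
from collections import Counter

BITS = 27  # single-precision mantissa width

def lsb_distribution(values):
    """Percentage of results that differ from the rounded exact result by n LSBs."""
    counts = Counter()
    for x in values:
        m, _ = math.frexp(math.sqrt(x))      # exact result, mantissa in [0.5, 1)
        scaled = m * 2 ** BITS
        rounded_exact = round(scaled)        # exact result rounded to 27 bits
        routine = math.floor(scaled)         # hypothetical routine: truncates instead
        counts[routine - rounded_exact] += 1
    total = sum(counts.values())
    return {diff: 100.0 * n / total for diff, n in sorted(counts.items())}

samples = [0.5 + k * 0.37 for k in range(1, 10_001)]   # arbitrary sample points
distribution = lsb_distribution(samples)
for diff, percent in distribution.items():
    print(f"{diff:+d}: {percent:.0f}%")
```

For a real routine the same tally would be made between the routine's result and the rounded exact result, giving the five-column distribution shown in each accuracy section.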

Appendix A shows accuracy results derived from the ELEFUNT tests of W. J.
Cody, Argonne National Laboratory. These tests show accuracy derived by
testing carefully-chosen identities for each function. This appendix is
provided for your information, not for comparison with the test results
described above. Such a comparison would not be meaningful.
Chapter 2

Square Root Routines 



SQRT 



Description 

The SQRT routine calculates the single-precision, floating-point square root 
of its single-precision, floating-point argument. That is: 

SQRT(x) = √x = x^(1/2)

Routines Called 

SQRT calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value greater than or 
equal to 0.0. 

Type of Result 

The result returned is a single-precision, floating-point value greater than or 
equal to 0.0. 



Accuracy of Result

test interval:            0.00000 through 8192.0
MRE:                      8.09 × 10^-9 (26.9 bits)
RMS:                      3.21 × 10^-9 (28.3 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     98%    2%     0%


Algorithm Used 

SQRT(x) is calculated as follows.

First the routine does a linear, single-precision approximation on the
argument to provide an initial guess for √x. The routine then does two
iterations of the Newton-Raphson method, which results in an answer that is
correct to, but not always including, the last bit.

If x < 0.0

    SQRT(x) = SQRT(|x|)

If x = 0.0

    SQRT(x) = 0.0

If x > 0.0

    Let x = 2^(2b) • f where .25 ≤ f < 1.0
    then √x = 2^b • √f
    and z_0 = 2^b • (a•f + b)

    a = .82812500 if .25 ≤ f < .5
      = .58593750 if .5 ≤ f < 1.0
    b = .29722518 if .25 ≤ f < .5
      = .42060167 if .5 ≤ f < 1.0

    (Inside the parentheses, b is the additive constant listed here, not
    the exponent b.)

The Newton-Raphson method, as applied to the SQRT function, yields the
following iterative approximation.

    z_(k+1) = 1/2 • (z_k + x/z_k)

    z_(k+1) = the next iteration

    z_k = the current iteration

    x = the number whose square root is being calculated

    z_0 = the initial approximation calculated by the linear approximation
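
The two stages of the algorithm can be sketched in Python as follows. This is a minimal illustration, not the library code: it runs in Python double precision rather than 27-bit single precision, the name sqrt_sketch is hypothetical, and it assumes the first guess is 2^b • (a•f + b), the reading under which the listed constants approximate √f.

```python
import math

def sqrt_sketch(x: float) -> float:
    """Linear first guess followed by two Newton-Raphson iterations."""
    if x < 0.0:
        x = -x                          # the routine also issues a message in this case
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                # x = m * 2**e with 0.5 <= m < 1
    if e % 2:                           # odd exponent: fold one factor of 2 into f
        f, b = m / 2.0, (e + 1) // 2    # 0.25 <= f < 0.5
        slope, offset = 0.82812500, 0.29722518
    else:
        f, b = m, e // 2                # 0.5 <= f < 1.0
        slope, offset = 0.58593750, 0.42060167
    z = math.ldexp(slope * f + offset, b)   # first guess z_0
    for _ in range(2):
        z = 0.5 * (z + x / z)           # z_(k+1) = 1/2 * (z_k + x/z_k)
    return z

for value in (0.25, 2.0, 8192.0):
    print(value, sqrt_sketch(value), math.sqrt(value))
```

The linear guess is within about one percent of √x, and each Newton-Raphson iteration roughly squares the relative error, so two iterations bring it below the 2^-27 resolution of a single-precision result.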

Error Conditions 

If the argument is negative, the following message is issued and the absolute 
value of the argument is used. 

SQRT: Negative arg; result = SQRT(ABS(arg)) 
DSQRT



Description

The DSQRT routine calculates the double-precision, D-floating-point square
root of its double-precision, D-floating-point argument. That is:

DSQRT(x) = √x = x^(1/2)

Routines Called 

DSQRT calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value greater than 
or equal to 0.0. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
or equal to 0.0. 



Accuracy of Result

test interval:            0.00000 through 8192.0
MRE:                      3.25 × 10^-19 (61.4 bits)
RMS:                      1.23 × 10^-19 (62.8 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     75%    25%    0%


Algorithm Used 

DSQRT(x) is calculated as follows.

First the routine does a linear, single-precision approximation on the
high-order word. Then the routine does two single-precision iterations of the
Newton-Raphson method, followed by two double-precision iterations of the
Newton-Raphson method using a value derived from the linear approximation.

The linear approximation is as follows.

If x < 0.0

    DSQRT(x) = DSQRT(|x|)

If x = 0.0

    DSQRT(x) = 0.0

If x > 0.0

    Let x = 2^(2b) • f where .25 ≤ f < 1.0
    then √x = 2^b • √f
    and z_0 = 2^b • (a•f + b)

    a = .82812500 if .25 ≤ f < .5
      = .58593750 if .5 ≤ f < 1.0
    b = .29722518 if .25 ≤ f < .5
      = .42060167 if .5 ≤ f < 1.0

    (Inside the parentheses, b is the additive constant listed here, not
    the exponent b.)

The Newton-Raphson method yields the following iterative approximation.

    z_(k+1) = 1/2 • (z_k + x/z_k)

    z_(k+1) = the next iteration

    z_k = the current iteration

    x = the number whose square root is being calculated

    z_0 = the initial approximation calculated by the linear approximation

For the single-precision approximations, x is truncated to single-precision and
all calculations are done in single-precision. For the double-precision
iterations, the full double-precision value of x is used, the current value of
z_2 is zero-extended to double-precision, and all remaining calculations are
done in double-precision.
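
A short calculation suggests why four Newton-Raphson iterations are used. The Python sketch below is an illustration only: it carries every iteration at 50 decimal digits, so it ignores the single-precision rounding of the first two iterations, and it assumes a first guess of a•f + b for the sample argument x = 0.5 (f = 0.5, exponent b = 0).

```python
from decimal import Decimal, getcontext

getcontext().prec = 50                      # ample working precision for the illustration

x = Decimal("0.5")                          # sample argument: f = 0.5, exponent b = 0
exact = x.sqrt()
z = Decimal("0.58593750") * x + Decimal("0.42060167")   # assumed first guess a*f + b

relative_errors = []
for _ in range(4):
    z = (z + x / z) / 2                     # one Newton-Raphson iteration
    relative_errors.append(abs(z - exact) / exact)

target = Decimal(2) ** -62                  # D-floating-point mantissa is 62 bits
for k, err in enumerate(relative_errors, start=1):
    print(k, f"{err:.2E}", err < target)
```

For this argument the relative error after the third iteration is still slightly above 2^-62, and the fourth iteration takes it far below, which is consistent with the two single-precision plus two double-precision iterations described above.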

Error Conditions 

If the argument is negative, the following message is issued and the absolute 
value of the argument is used. 

DSQRT: Negative arg; result = DSQRT(ABS(arg)) 
GSQRT



Description

The GSQRT routine calculates the double-precision, G-floating-point square
root of its double-precision, G-floating-point argument. That is:

GSQRT(x) = √x = x^(1/2)

Routines Called 

GSQRT calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value greater than 
or equal to 0.0. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to 0.0. 



Accuracy of Result

test interval:            0.00000 through 8192.0
MRE:                      2.60 × 10^-18 (58.4 bits)
RMS:                      9.87 × 10^-19 (59.8 bits)
LSB error distribution:   -2     -1     0      +1     +2
                          0%     0%     75%    25%    0%



Algorithm Used 

GSQRT(x) is calculated as follows. 

First the routine does a linear, single-precision approximation on the high- 
order word. Then the routine does two single-precision iterations of the New- 
ton-Raphson method, followed by two double-precision iterations of the New- 
ton-Raphson method using a value derived from the linear approximation. 

The linear approximation is as follows. 

If x < 0.0 

GSQRT(x) = GSQRT(lxl) 

If x = 0.0 

GSQRT(x) = 0.0 

If x > 0.0 

Let x = 2 2b -f where .25 < f < 1.0 
thenVx = 2 b -vT 
and z = 2 b «(af-b) 

a = .82812500 if .25<f<.5 
a = .58593750 if .5<f< 1.0 
b = .29722518 if .25 < f < .5 
b= .42060167 if .5 <f< 1.0 
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The Newton-Raphson method yields the following iterative approximation. 

z k+1 = 1/2 •(z k + x/z k ) 

z k+1 = the next iteration 

z k = the current iteration 

x = the number whose square root is being calculated 

z = the initial approximation calculated by the linear approxima- 
tion 

For the single-precision approximations, x is truncated to single-precision and
all calculations are done in single-precision. For the double-precision
iterations, the full double-precision value of x is used, the current value of
z_2 is zero-extended to double-precision, and all remaining calculations are
done in double-precision.
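
The steps above can be sketched in Python. This is only an illustration, not the library's code: every step uses ordinary Python floats instead of the mixed single-precision and G-floating arithmetic described, the name gsqrt_sketch is hypothetical, and the additive coefficient of the linear approximation is called c to keep it apart from the exponent.

```python
import math

def gsqrt_sketch(x, iterations=4):
    """Square root by a linear first guess plus Newton-Raphson refinement."""
    x = abs(x)                          # a negative argument is replaced by |x|
    if x == 0.0:
        return 0.0
    f, e = math.frexp(x)                # x = f * 2**e with .5 <= f < 1
    if e % 2:                           # force an even exponent: x = 2**(2b) * f
        f, e = f / 2.0, e + 1           # now .25 <= f < .5
    if f < 0.5:
        a, c = 0.82812500, 0.29722518
    else:
        a, c = 0.58593750, 0.42060167
    z = math.ldexp(a * f + c, e // 2)   # z_0 = 2**b * (a*f + c)
    for _ in range(iterations):
        z = 0.5 * (z + x / z)           # z_(k+1) = 1/2 * (z_k + x/z_k)
    return z
```

Because the linear guess is already within about one percent of the true root, each Newton-Raphson step roughly doubles the number of correct digits, which is why a small fixed number of iterations suffices.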

Error Conditions 

If the argument is negative, the following message is issued and the absolute 
value of the argument is used. 

GSQRT: Negative arg; result = GSQRT(ABS(arg)) 
CSQRT



Description 

The CSQRT routine calculates the complex, single-precision square root of its 
complex, single-precision argument. That is: 

CSQRT(z) = sqrt(z) = z^(1/2)

Routines Called 

CSQRT calls the SQRT and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value; it 
can be any such value. 

Type of Result 

The result returned is a complex, single-precision, floating-point value, the 
real part of which is greater than or equal to 0.0. 

Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -1000.0 through 1000.0 imaginary

MRE:                      3.07x10^-8 (25.0 bits) real
                          3.05x10^-8 (25.0 bits) imaginary

RMS:                      7.05x10^-9 (27.1 bits) real
                          7.33x10^-9 (27.0 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          2%   16%   59%   20%    2%   real
                          2%   19%   55%   20%    3%   imaginary

Algorithm Used 

CSQRT(z) is calculated as follows. 

Let z = x + i*y

then CSQRT(z) = u + i*v, which is defined as follows.

If x >= 0.0

u = sqrt((|x| + |z|)/2.0)
v = y/(2.0*u)

If x < 0.0 and y >= 0.0

u = y/(2.0*v)
v = sqrt((|x| + |z|)/2.0)

If x and y are both < 0.0

u = y/(2.0*v)
v = -sqrt((|x| + |z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in
the closed interval [-pi/2, +pi/2]. That is, the real part of the result is greater
than or equal to 0.0.
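
The three cases can be sketched in Python as follows. This is an illustration under stated assumptions, not the library routine: it works on a pair of Python floats, the name csqrt_sketch is hypothetical, and the explicit zero check is an addition made here to avoid dividing by zero.

```python
import math

def csqrt_sketch(x, y):
    """Return (u, v) with u + i*v = sqrt(x + i*y) and u >= 0."""
    if x == 0.0 and y == 0.0:
        return 0.0, 0.0                 # assumption: avoids dividing by zero
    r = math.sqrt((abs(x) + math.hypot(x, y)) / 2.0)
    if x >= 0.0:
        u = r                           # the larger component is the real part
        v = y / (2.0 * u)
    elif y >= 0.0:
        v = r                           # the larger component is the imaginary part
        u = y / (2.0 * v)
    else:
        v = -r
        u = y / (2.0 * v)
    return u, v
```

In every case the square root is taken of a sum of non-negative terms and the other component comes from a division, so no subtraction of nearly equal quantities occurs.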
Error Conditions

If the imaginary part of the input value is too small, underflow can occur on
y/(2.0*u) or y/(2.0*v). If such underflow occurs, one of the following messages
is issued and the relevant part of the result is set to 0.0.

CSQRT: Real part underflow 
CSQRT: Imaginary part underflow 
CDSQRT



Description 

The CDSQRT subroutine calculates the complex, double-precision, D-float- 
ing-point square root of its complex, double-precision, D-floating-point argu- 
ment. That is: 

CDSQRT(z,r) = sqrt(z) = z^(1/2)

z = location of input value 
r = location of result 

Routines Called 

CDSQRT calls the DSQRT and MTHERR routines. 

Type of Arguments 

CDSQRT is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a com- 
plex, double-precision, D-floating-point value; it can be any such value. 

Type of Result 

The result returned is a complex, double-precision, D-floating-point value, 
the real part of which is greater than or equal to 0.0. It is returned in the 
second vector (r) supplied in the call. The real part of the result is returned in 
the first element of r; the imaginary part is returned in the second element 
of r. 



Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -1000.0 through 1000.0 imaginary

MRE:                      1.10x10^-18 (59.7 bits) real
                          1.04x10^-18 (59.7 bits) imaginary

RMS:                      2.69x10^-19 (61.7 bits) real
                          2.75x10^-19 (61.7 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          4%   17%   43%   32%    5%   real
                          5%   24%   41%   25%    5%   imaginary
Algorithm Used

CDSQRT is calculated as follows.

Let z = x + i*y

then CDSQRT(z) = u + i*v, which is defined as follows.

If x >= 0.0

u = sqrt((|x| + |z|)/2.0)
v = y/(2.0*u)

If x < 0.0 and y >= 0.0

u = y/(2.0*v)
v = sqrt((|x| + |z|)/2.0)

If x and y are both < 0.0

u = y/(2.0*v)
v = -sqrt((|x| + |z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in
the closed interval [-pi/2, +pi/2]. That is, the real part of the result is greater
than or equal to 0.0.

Error Conditions 

If the imaginary part of the input value is too small, underflow can occur on 
y/(2.0*u) or y/(2.0*v). If such underflow occurs, one of the following messages
is issued and the relevant part of the result is set to 0.0. 

CDSQRT: Real part underflow 
CDSQRT: Imaginary part underflow 
CGSQRT



Description 

The CGSQRT subroutine calculates the complex, double-precision, G-float- 
ing-point square root of its complex, double-precision, G-floating-point argu- 
ment. That is: 

CGSQRT(z,r) = sqrt(z) = z^(1/2)

z = location of input value 
r = location of result 

Routines Called 

CGSQRT calls the GSQRT and MTHERR routines. 

Type of Argument 

CGSQRT is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a com- 
plex, double-precision, G-floating-point value; it can be any such value. 

Type of Result 

The result returned is a complex, double-precision, G-floating-point value; it 
may be any such value. It is returned in the second vector (r) supplied in the 
call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -1000.0 through 1000.0 imaginary

MRE:                      8.61x10^-18 (56.7 bits) real
                          8.78x10^-18 (56.7 bits) imaginary

RMS:                      2.16x10^-18 (58.7 bits) real
                          2.21x10^-18 (58.7 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          5%   16%   41%   32%    5%   real
                          5%   25%   40%   25%    5%   imaginary
Algorithm Used

CGSQRT(z) is calculated as follows.

Let z = x + i*y

then CGSQRT(z) = u + i*v is defined as follows.

If x >= 0.0

u = sqrt((|x| + |z|)/2.0)
v = y/(2.0*u)

If x < 0.0 and y >= 0.0

u = y/(2.0*v)
v = sqrt((|x| + |z|)/2.0)

If x and y are both < 0.0

u = y/(2.0*v)
v = -sqrt((|x| + |z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in
the closed interval [-pi/2, +pi/2].

Error Conditions 

If the imaginary part of the argument is too small, underflow can occur on 
y/(2.0*u) or y/(2.0*v). If this occurs, one of the following messages is issued 
and the relevant part of the result is set to 0.0. 

CGSQRT: Real part underflow 
CGSQRT: Imaginary part underflow 
Chapter 3
Logarithm Routines 



ALOG 



Description 

The ALOG routine calculates the single-precision, floating-point natural loga- 
rithm of its argument. That is: 

ALOG(x) = log_e(x)

Routines Called 

ALOG calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value greater than 
0.0. 

Type of Result 

The result returned is a single-precision, floating-point value in the range 
-89.415 to 88.029. 



Accuracy of Result

test interval:            1.46937x10^-39 through 256.00
MRE:                      1.84x10^-8 (25.7 bits)
RMS:                      5.21x10^-9 (27.5 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%    1%   81%   18%    0%



Algorithm Used 

ALOG(x) is calculated as follows. 

If x = 0.0

ALOG(x) = -machine infinity

If x < 0.0

ALOG(x) = ALOG(|x|)

If x is close to 1.0

ALOG(x) = L3*z^7 + L4*z^5 + L5*z^3 + L6*z
z = (x-1)/(x+1)
L3 = .301003281
L4 = .39965794919
L5 = .666669484507
L6 = 2.0

If x is not close to 1.0

ALOG(x) = (k-.5)*log_e(2) + log_e(f*sqrt(2))
x = 2^k * f
log_e(f*sqrt(2)) = L3*z^7 + L4*z^5 + L5*z^3 + L6*z
z = (f - sqrt(.5))/(f + sqrt(.5))
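
A minimal Python sketch of this calculation follows. It is not the library's code: the name alog_sketch is hypothetical, the arithmetic is in Python floats rather than single precision, and because the manual does not say how "close to 1.0" is decided, the interval test below is an assumption made only for illustration.

```python
import math

L3, L4, L5, L6 = 0.301003281, 0.39965794919, 0.666669484507, 2.0

def _poly(z):
    """L3*z**7 + L4*z**5 + L5*z**3 + L6*z, evaluated in nested form."""
    z2 = z * z
    return z * (L6 + z2 * (L5 + z2 * (L4 + z2 * L3)))

def alog_sketch(x):
    if x == 0.0:
        return -math.inf                # stands in for -machine infinity
    x = abs(x)                          # negative argument: use |x|
    if 0.75 < x < 1.25:                 # hypothetical test for "close to 1.0"
        return _poly((x - 1.0) / (x + 1.0))
    f, k = math.frexp(x)                # x = 2**k * f with .5 <= f < 1
    root_half = math.sqrt(0.5)
    z = (f - root_half) / (f + root_half)
    return (k - 0.5) * math.log(2.0) + _poly(z)
```

The polynomial is a slightly adjusted form of the series 2*(z + z^3/3 + z^5/5 + z^7/7) for log_e((1+z)/(1-z)), which is why L6 is exactly 2.0 and L5 is close to 2/3.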
Reference

Hart et al., Computer Approximations (New York, N.Y.: John Wiley and
Sons, 1968).

The algorithm used is #2662, the coefficients are listed on page 193, and the 
range of validity is on page 111. 

Error Conditions 

1. If the argument is equal to 0.0, the following message is issued and the 
result is set to -machine infinity. 

ALOG: Arg is zero; result = -infinity. 

2. If the argument is less than 0.0, the following message is issued and the 
absolute value of the argument is used. 

ALOG: Negative arg, result = ALOG(ABS(arg)) 
ALOG10



Description 

The ALOG10 routine calculates the single-precision, floating-point base- 10 
logarithm of its single-precision, floating-point argument. That is: 

ALOG10(x) = log_10(x)

Routines Called 

ALOG10 calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value greater than 
0.0. 

Type of Result 

The result returned is a single-precision, floating-point value in the range 
-38.832 to 38.230. 



Accuracy of Result

test interval:            1.46937x10^-39 through 256.00
MRE:                      2.52x10^-8 (25.2 bits)
RMS:                      5.99x10^-9 (27.3 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          1%   19%   64%   15%    0%



Algorithm Used 

ALOG10(x) is calculated as follows. 

If x = 0.0

ALOG10(x) = -machine infinity

If x < 0.0

ALOG10(x) = ALOG10(|x|)

If x is close to 1.0

ALOG10(x) = log_e(x)*log_10(e)
log_e(x) = L3*z^7 + L4*z^5 + L5*z^3 + L6*z
z = (x-1)/(x+1)
L3 = .301003281
L4 = .39965794919
L5 = .666669484507
L6 = 2.0

If x is not close to 1.0

ALOG10(x) = log_e(x)*log_10(e)
x = 2^k * f
log_e(x) = (k-.5)*log_e(2) + log_e(f*sqrt(2))
log_e(f*sqrt(2)) = L3*z^7 + L4*z^5 + L5*z^3 + L6*z
z = (f - sqrt(.5))/(f + sqrt(.5))
Reference

Hart et al., Computer Approximations (New York, N.Y.: John Wiley and
Sons, 1968). The algorithm used is #2662, the coefficients are listed on page
193, and the range of validity is on page 111.

Error Conditions 

1. If the argument is 0.0, the following message is issued and the result is set 
to -machine infinity. 

ALOG10: Arg is zero; result = -infinity 

2. If the argument is less than 0.0, the following message is issued and the 
absolute value of the argument is used. 

ALOG10: Negative arg; result = ALOG10(ABS(arg)) 
DLOG



Description 

The DLOG routine calculates the double-precision, D-floating-point natural 
logarithm of its double-precision, D-floating-point argument. That is: 

DLOG(x) = log_e(x)

Routines Called 

DLOG calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value greater than 
0.0. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-89.415 to 88.029. 



Accuracy of Result

test interval:            1.46937x10^-39 through 256.00
MRE:                      9.78x10^-19 (59.8 bits)
RMS:                      3.03x10^-19 (61.5 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          1%   12%   51%   23%   13%


Algorithm Used 

DLOG(x) is calculated as follows. 


If x = 0.0

DLOG(x) = -machine infinity

If x < 0.0

DLOG(x) = DLOG(|x|)

If x > 0.0

x = 2^k * f where .5 <= f < 1.0
and g and n are defined so that
f = 2^(-n) * g where 1/sqrt(2) <= g < sqrt(2)
Then DLOG(x) = (k-n)*log_e(2) + log_e(g)
log_e(g) is evaluated by defining
s = (g-1)/(g+1) and
z = 2*s
and then calculating
log_e(g) = log_e((1+z/2)/(1-z/2)) using a minimax
rational approximation.
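
The argument reduction can be sketched in Python as follows. The manual does not list the coefficients of the minimax rational approximation, so this sketch substitutes math.atanh, using the identity log_e((1+z/2)/(1-z/2)) = 2*atanh(z/2). The name dlog_sketch is hypothetical and Python floats stand in for D-floating values.

```python
import math

def dlog_sketch(x):
    if x == 0.0:
        return -math.inf                # stands in for -machine infinity
    x = abs(x)                          # negative argument: use |x|
    f, k = math.frexp(x)                # x = 2**k * f with .5 <= f < 1
    if f < math.sqrt(0.5):
        n, g = 1, 2.0 * f               # f = 2**(-1) * g
    else:
        n, g = 0, f                     # f = 2**0 * g
    s = (g - 1.0) / (g + 1.0)
    z = 2.0 * s
    # atanh replaces the minimax rational approximation, which is not
    # reproduced in the manual.
    log_g = 2.0 * math.atanh(z / 2.0)
    return (k - n) * math.log(2.0) + log_g
```

Choosing n so that g lies between 1/sqrt(2) and sqrt(2) keeps s small and centered on zero, which is what makes a short approximation accurate.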
Error Conditions

1. If the argument is equal to 0.0, the following message is issued and the 
result is set to -machine infinity. 

DLOG: Arg is zero; result = -infinity 

2. If the argument is less than 0.0, the following message is issued and the 
absolute value of the argument is used. 

DLOG: Negative arg; result = DLOG(ABS(arg)) 
DLOG10



Description 

The DLOG10 routine calculates the double-precision, D-floating-point base-10
logarithm of its double-precision, D-floating-point argument. That is:

DLOG10(x) = log_10(x)

Routines Called 

DLOG10 calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value greater than 
0.0. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-38.832 to 38.230.



Accuracy of Result

test interval:            1.46937x10^-39 through 256.00
MRE:                      1.20x10^-18 (59.5 bits)
RMS:                      3.65x10^-19 (61.2 bits)

LSB error distribution:   -2    -1     0    +1    +2    +3
                          3%   17%   38%   26%   14%    2%



Algorithm Used 

DLOG10(x) is calculated as follows. 

If x = 0.0

DLOG10(x) = -machine infinity

If x < 0.0

DLOG10(x) = DLOG10(|x|)

If x > 0.0

x = 2^k * f where .5 <= f < 1.0
and g and n are defined so that
f = 2^(-n) * g where 1/sqrt(2) <= g < sqrt(2)
Then DLOG10(x) = log_10(e)*log_e(x) = log_e(x)/log_e(10)
log_e(g) is evaluated by defining
s = (g-1)/(g+1) and
z = 2*s
and then calculating
log_e(g) = log_e((1+z/2)/(1-z/2)) using a minimax
rational approximation.
Error Conditions

1. If the argument is equal to 0.0, the following message is issued and the 
result is set to -machine infinity. 

DLOG10: Arg is zero; result = -infinity 

2. If the argument is less than 0.0, the following message is issued and the 
absolute value of the argument is used. 

DLOG10: Negative arg; result = DLOG10(ABS(arg)) 
GLOG



Description 

The GLOG routine calculates the double-precision, G-floating-point natural 
logarithm of its double-precision, G-floating-point argument. That is: 

GLOG(x) = log_e(x)

Routines Called 

GLOG calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value greater than 
0.0. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-710.475 to 709.089. 



Accuracy of Result

test interval:            0.00000 through 256.00
MRE:                      5.13x10^-18 (57.4 bits)
RMS:                      1.26x10^-18 (59.5 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   10%   74%   16%    0%



Algorithm Used 

GLOG(x) is calculated as follows. 

If x = 0.0

GLOG(x) = -machine infinity

If x < 0.0

GLOG(x) = GLOG(|x|)

If x > 0.0

x = 2^k * f where .5 <= f < 1.0
and g and n are defined so that
f = 2^(-n) * g where 1/sqrt(2) <= g < sqrt(2)
Then GLOG(x) = (k-n)*log_e(2) + log_e(g)
log_e(g) is evaluated by defining
s = (g-1)/(g+1) and
z = 2*s
and then calculating
log_e(g) = log_e((1+z/2)/(1-z/2))
using a minimax rational approximation.
Error Conditions

1. If the argument is equal to 0.0, the following message is issued and the 
result is set to -machine infinity. 

GLOG: Arg is zero; result = -infinity 

2. If the argument is negative, the following message is issued and the abso- 
lute value of the argument is used. 

GLOG: Negative arg; result = GLOG(ABS(arg)) 
GLOG10



Description 

The GLOG10 routine calculates the double-precision, G-floating-point base- 
10 logarithm of its double-precision, G-floating-point argument. That is: 

GLOG10(x) = log_10(x)

Routines Called 

GLOG10 calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value greater than 
0.0. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-308.555 to 307.953. 



Accuracy of Result

test interval:            2.78134x10^-309 through 256.00
MRE:                      6.05x10^-18 (57.2 bits)
RMS:                      1.42x10^-18 (59.3 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          1%   18%   62%   18%    0%



Algorithm Used 

GLOG10(x) is calculated as follows. 

If x = 0.0

GLOG10(x) = -machine infinity

If x < 0.0

GLOG10(x) = GLOG10(|x|)

If x > 0.0

x = 2^k * f where .5 <= f < 1.0
and g and n are defined so that
f = 2^(-n) * g where 1/sqrt(2) <= g < sqrt(2)
Then GLOG10(x) = log_10(e)*log_e(x) = log_e(x)/log_e(10)
log_e(g) is evaluated by defining
s = (g-1)/(g+1) and
z = 2*s
and then calculating
log_e(g) = log_e((1+z/2)/(1-z/2))
using a minimax rational approximation.
Error Conditions

1. If the argument is equal to 0.0, the following message is issued and the 
result is set to -machine infinity. 

GLOG10: Arg is zero; result = -infinity 

2. If the argument is negative, the following message is issued and the abso- 
lute value of the argument is used. 

GLOG10: Negative arg; result = GLOG10(ABS(arg)) 
CLOG



Description 

The CLOG routine calculates the complex, single-precision, floating-point 
natural logarithm of its complex, single-precision, floating-point argument. 
That is: 

CLOG(z) = log_e(z)

Routines Called 

CLOG calls the ALOG, ATAN, ATAN2, and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value. The
real and imaginary parts cannot both be equal to 0.0, although either one
alone can be.

Type of Result 

The result returned is a complex, single-precision, floating-point value. The 
real part of the result is in the range -89.415 to 88.029; the imaginary part is in 
the range -pi to pi.



Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -100.00 through 100.00 imaginary

MRE:                      5.30x10^-5 (14.2 bits) real
                          1.49x10^-8 (26.0 bits) imaginary

RMS:                      1.06x10^-7 (23.2 bits) real
                          3.44x10^-9 (28.1 bits) imaginary

LSB error distribution:   -4+   -3    -2    -1     0    +1    +2
                           1%   1%    1%    6%   82%    7%    1%   real
                           0%   0%    0%    3%   94%    3%    0%   imaginary

Algorithm Used 

CLOG(z) is calculated as follows. 

Let z = x + i*y

If x = 0.0 and y = 0.0

CLOG(z) = (+infinity, 0.0)

If x = 0.0 and y != 0.0

CLOG(z) = log_e(|y|) + i*sgn(y)*pi/2

If x != 0.0 and y = 0.0

If x > 0.0

CLOG(z) = log_e(x) + i*0.0

If x < 0.0

CLOG(z) = log_e(|x|) + i*pi

If x != 0.0 and y != 0.0

CLOG(z) = u + i*v
u = .5*log_e(x^2 + y^2)
v = tan^-1(y/x)

Scaled values are calculated on occurrences of overflow/underflow
for (x^2, y^2) or (x^2 + y^2) and propagated to give a valid in-range result
for u.
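
The case analysis can be sketched in Python as shown below. This is an illustration only: the name clog_sketch is hypothetical, the overflow and underflow scaling is omitted, and the general case uses the two-argument math.atan2 so that the angle falls in the documented range -pi to pi, which is an assumption about how the quadrant is resolved.

```python
import math

def clog_sketch(x, y):
    """Return (u, v) with u + i*v = log_e(x + i*y)."""
    if x == 0.0 and y == 0.0:
        return math.inf, 0.0                      # documented error result
    if x == 0.0:
        return math.log(abs(y)), math.copysign(math.pi / 2.0, y)
    if y == 0.0:
        return math.log(abs(x)), (0.0 if x > 0.0 else math.pi)
    u = 0.5 * math.log(x * x + y * y)             # no overflow scaling here
    v = math.atan2(y, x)                          # quadrant-correct arctangent
    return u, v
```

Handling the axis cases separately avoids forming y/x when x is zero and returns the exact angles 0, pi, and +/-pi/2 without calling the arctangent at all.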

Error Conditions 

1. If both parts of the argument equal 0.0, the following message is issued 
and the result is set to (+infinity, 0.0). 

CLOG: Arg is zero; result = (+infinity, zero)

2. If either part of the result underflows, one or both of the following mes- 
sages are issued and the relevant part of the result is set to 0.0. 

CLOG: Real part underflow 
CLOG: Imaginary part underflow 
CDLOG



Description 

The CDLOG subroutine calculates the complex, double-precision, D-floating- 
point natural logarithm of its complex, double-precision, D-floating-point ar- 
gument. That is: 

CDLOG(z,r) = log_e(z)

z = location of input value 
r = location of result 

Routines Called 

CDLOG calls the DLOG, DATAN, DATAN2, and MTHERR routines. 

Type of Argument 

CDLOG is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a complex,
double-precision, D-floating-point value. The real and imaginary parts cannot
both be equal to 0.0, although either one alone can be.

Type of Result 

The result returned is a complex, double-precision, D-floating-point value. 
The real part of the result is in the range -89.415 to 88.376; the imaginary part 
is in the range -pi to pi. The result is returned in the second vector (r) supplied
in the call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -100.00 through 100.00 imaginary

MRE:                      9.07x10^-16 (50.0 bits) real
                          5.09x10^-19 (60.8 bits) imaginary

RMS:                      1.59x10^-18 (59.1 bits) real
                          1.04x10^-19 (63.1 bits) imaginary

LSB error distribution:   -4+   -3    -2    -1     0    +1    +2
                           1%   1%    1%    5%   84%    6%    1%   real
                           0%   0%    0%    4%   92%    4%    0%   imaginary
Algorithm Used

CDLOG is calculated as follows.

Let z = x + i*y

If x = 0.0 and y = 0.0

CDLOG(z) = (+infinity, 0.0)

If x = 0.0 and y != 0.0

CDLOG(z) = log_e(|y|) + i*sgn(y)*pi/2

If x != 0.0 and y = 0.0

If x > 0.0

CDLOG(z) = log_e(x) + i*0.0

If x < 0.0

CDLOG(z) = log_e(|x|) + i*pi

If x != 0.0 and y != 0.0

CDLOG(z) = u + i*v
u = .5*log_e(x^2 + y^2)
v = tan^-1(y/x)

Scaled values are calculated on occurrences of overflow/underflow
for (x^2, y^2) or (x^2 + y^2) and propagated to give a valid in-range
result for u.

Error Conditions 

1. If both parts of the argument equal 0.0, the following message is issued 
and the result is set to (+infinity, 0.0). 

CDLOG: Arg is zero; result = (+infinity, zero) 

2. If either part of the result underflows, one or both of the following mes- 
sages are issued and the relevant part of the result is set to 0.0. 

CDLOG: Imaginary part underflow 
CDLOG: Real part underflow 
CGLOG



Description 

The CGLOG subroutine calculates the complex, double-precision, G-floating- 
point natural logarithm of its complex, double-precision, G-floating-point ar- 
gument. That is: 

CGLOG(z,r) = log_e(z)

z = location of input value 
r = location of result 

Routines Called 

CGLOG calls the GLOG, GATAN, GATAN2, and MTHERR routines. 

Type of Argument 

CGLOG is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a complex,
double-precision, G-floating-point value. The real and imaginary parts cannot
both be equal to 0.0, although either one alone can be.

Type of Result 

The result returned is a complex, double-precision, G-floating-point value. 
The real part of the result is in the range -710.475 to 709.436; the imaginary 
part is in the range -pi to pi. The result is returned in the second vector (r)
supplied in the call. The real part of the result is returned in the first element 
of r; the imaginary part is returned in the second element of r. 



Accuracy of Result

test interval:            -1000.0 through 1000.0 real
                          -100.00 through 100.00 imaginary

MRE:                      7.15x10^-15 (47.0 bits) real
                          3.54x10^-18 (58.0 bits) imaginary

RMS:                      1.77x10^-17 (55.7 bits) real
                          8.19x10^-19 (60.1 bits) imaginary

LSB error distribution:   -4+   -3    -2    -1     0    +1    +2
                           1%   0%    1%    5%   86%    6%    1%   real
                           0%   0%    0%    4%   92%    4%    0%   imaginary
Algorithm Used

CGLOG(z) is calculated as follows.

Let z = x + i*y

If x = 0.0 and y = 0.0

CGLOG(z) = (+machine infinity, 0.0)

If x = 0.0 and y != 0.0

CGLOG(z) = log_e(|y|) + i*sgn(y)*pi/2

If x != 0.0 and y = 0.0

If x > 0.0

CGLOG(z) = log_e(x) + i*0.0

If x < 0.0

CGLOG(z) = log_e(|x|) + i*pi

If x != 0.0 and y != 0.0

CGLOG(z) = u + i*v
u = .5*log_e(x^2 + y^2)
v = tan^-1(y/x)

Scaled values are calculated on occurrences of overflow/underflow
for (x^2, y^2) or (x^2 + y^2) and propagated to give a valid in-range result
for u.

Error Conditions 

1. If both parts of the argument equal 0.0, the following message is issued 
and the result is set to (+machine infinity, 0.0). 

CGLOG: Arg is zero; result = (+infinity, zero) 

2. If either part of the result underflows, one or both of the following mes- 
sages are issued and the relevant part of the result is set to 0.0. 

CGLOG: Real part underflow 
CGLOG: Imaginary part underflow 
Chapter 4

Exponential and Exponentiation Routines 



EXP 



Description 

The EXP routine calculates the single-precision, floating-point exponential 
function of its single-precision, floating-point argument. That is: 

EXP(x) = e^x

Routines Called 

EXP calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value in the range 
-89.4159863 to 88.0296919. 

Type of Result 

The result returned is a single-precision, floating-point value greater than 
zero. 



Accuracy of Result

test interval:            -89.000 through 88.000
MRE:                      1.74x10^-8 (25.8 bits)
RMS:                      3.98x10^-9 (27.9 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%    2%   86%   12%    0%



Algorithm Used 

EXP(x) is calculated as follows. 

If x < -89.4159863

EXP(x) = 0.0

If x > 88.0296919

EXP(x) = +machine infinity

Otherwise, the argument is reduced as follows:

Let n = the nearest integer to x/log_e(2)

The reduced argument is:

g = x - n*log_e(2)

The calculation is:

EXP(x) = R(g)*2^(n+1)
R(g) = .5 + g*p/(q - g*p)
p = p1*g^2 + .25
q = q1*g^2 + .5
p1 = .00416028863
q1 = .0499871789
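
The reduction and rational approximation can be sketched in Python as follows. This is a sketch rather than the library's code: the name exp_sketch is hypothetical, Python floats replace single-precision arithmetic, and math.inf stands in for machine infinity.

```python
import math

P1, Q1 = 0.00416028863, 0.0499871789
LN2 = math.log(2.0)

def exp_sketch(x):
    if x < -89.4159863:
        return 0.0                      # documented underflow result
    if x > 88.0296919:
        return math.inf                 # stands in for +machine infinity
    n = round(x / LN2)                  # nearest integer to x/log_e(2)
    g = x - n * LN2                     # reduced argument, |g| <= log_e(2)/2
    g2 = g * g
    p = P1 * g2 + 0.25
    q = Q1 * g2 + 0.5
    r = 0.5 + g * p / (q - g * p)       # R(g), approximately e**g / 2
    return math.ldexp(r, n + 1)         # R(g) * 2**(n+1)
```

Since R(g) approximates half of e^g, the final scaling uses the exponent n+1 rather than n.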
Error Conditions

1. If the argument is less than -89.4159863, the following message is issued 
and the result is set to 0.0. 

EXP: Result underflow 

2. If the argument is greater than 88.0296919, the following message is issued 
and the result is set to +machine infinity.

EXP: Result overflow 
DEXP



Description 

The DEXP routine calculates the double-precision, D-floating-point exponen- 
tial function of its double-precision, D-floating-point argument. That is: 

DEXP(x) = e^x

Routines Called 

DEXP calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value in the range 
-89.415986292232944914 to 88.029691931113054295. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
zero. 



Accuracy of Result

test interval:            -89.000 through 88.000
MRE:                      4.89x10^-19 (60.8 bits)
RMS:                      1.17x10^-19 (62.9 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%    2%   86%   12%    0%



Algorithm Used 

DEXP(x) is calculated as follows. 

If x < -89.415986292232944914

DEXP(x) = 0.0

If x > 88.029691931113054295

DEXP(x) = +machine infinity

Otherwise, the argument is reduced as follows:

Let x1 = [x], the greatest integer in x
x2 = x - x1
n = the nearest integer to x/log_e(2)

The reduced argument is:

g = x1 - n*c1 + x2 - n*c2
c1 = .543 (octal)
c2 = log_e(2) - .543 (octal)

so that g = x - n*log_e(2).
The calculation is:

DEXP(x) = R(g)*2^(n+1)
R(g) = .5 + g*p/(q - g*p)

p = ((p2*g^2 + p1)*g^2) + p0
q = ((((q3*g^2 + q2)*g^2) + q1)*g^2) + q0

p0 = .250
p1 = .757531801594227767x10^-2
p2 = .315551927656846464x10^-4
q0 = .5
q1 = .568173026985512218x10^-1
q2 = .631218943743985036x10^-3
q3 = .751040283998700461x10^-6
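
A Python sketch of this calculation is given below, under explicit assumptions. The name dexp_sketch is hypothetical, Python floats (IEEE double precision) replace D-floating arithmetic, the reduced argument is formed so that it equals x - n*log_e(2), and p is taken as the polynomial p2*g^4 + p1*g^2 + p0 so that R(g) approximates half of e^g.

```python
import math

P = (0.250, 0.757531801594227767e-2, 0.315551927656846464e-4)
Q = (0.5, 0.568173026985512218e-1, 0.631218943743985036e-3,
     0.751040283998700461e-6)
C1 = 0o543 / 0o1000                     # .543 octal = 0.693359375
C2 = math.log(2.0) - C1                 # small correction, about -2.12e-4

def dexp_sketch(x):
    if x < -89.415986292232944914:
        return 0.0                      # documented underflow result
    if x > 88.029691931113054295:
        return math.inf                 # stands in for +machine infinity
    x1 = math.floor(x)                  # greatest integer in x
    x2 = x - x1
    n = round(x / math.log(2.0))        # nearest integer to x/log_e(2)
    g = ((x1 - n * C1) + x2) - n * C2   # equals x - n*log_e(2)
    g2 = g * g
    p = (P[2] * g2 + P[1]) * g2 + P[0]
    q = ((Q[3] * g2 + Q[2]) * g2 + Q[1]) * g2 + Q[0]
    r = 0.5 + g * p / (q - g * p)       # R(g), approximately e**g / 2
    return math.ldexp(r, n + 1)         # R(g) * 2**(n+1)
```

Splitting log_e(2) into a short constant c1, whose product with n is exact, and a small correction c2 keeps the subtraction x1 - n*c1 free of rounding error, so the reduced argument retains nearly full precision.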

Error Conditions 

1. If the argument is less than -89.415986292232944914, the following mes- 
sage is issued and the result is set to 0.0. 

DEXP: Result underflow 

2. If the argument is greater than 88.029691931113054295, the following mes- 
sage is issued and the result is set to +machine infinity. 

DEXP: Result overflow 
GEXP



Description 

The GEXP routine calculates the double-precision, G-floating-point exponen- 
tial function of its double-precision, G-floating-point argument. That is: 

GEXP(x) = e^x

Routines Called 

GEXP calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value in the range 
-710.475860073943942 to 709.089565712824051. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to zero. 



Accuracy of Result

test interval:            -89.000 through 88.000
MRE:                      3.99x10^-18 (57.8 bits)
RMS:                      9.40x10^-19 (59.9 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%    2%   85%   13%    0%



Algorithm Used 

GEXP(x) is calculated as follows. 

If x < -710.475860073943942

GEXP(x) = 0.0

If x > 709.089565712824051

GEXP(x) = +machine infinity

Otherwise, the argument is reduced as follows:

Let x1 = [x], the greatest integer in x
x2 = x - x1
n = the nearest integer to x/log_e(2)

The reduced argument is:

g = x1 - n*c1 + x2 - n*c2
c1 = .543 (octal)
c2 = log_e(2) - .543 (octal)

so that g = x - n*log_e(2).
The calculation is:

GEXP(x) = R(g)*2^(n+1)
R(g) = .5 + g*p/(q - g*p)

p = ((p2*g^2 + p1)*g^2) + p0
q = ((((q3*g^2 + q2)*g^2) + q1)*g^2) + q0

p0 = .250
p1 = .757531801594227767x10^-2
p2 = .315551927656846464x10^-4
q0 = .5
q1 = .568173026985512218x10^-1
q2 = .631218943743985036x10^-3
q3 = .751040283998700461x10^-6

Error Conditions 

1. If the argument is less than or equal to -710.475860073943942, the follow- 
ing message is issued and the result is set to 0.0. 

GEXP: Result underflow 

2. If the argument is greater than 709.089565712824051, the following mes- 
sage is issued and the result is set to +machine infinity. 

GEXP: Result overflow 
CEXP



Description 

The CEXP routine calculates the complex, single-precision, floating-point 
exponential function of its complex, single-precision, floating-point argument. 
That is: 

CEXP(z) = e^z

Routines Called 

CEXP calls the EXP, COS, SIN, and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value in the 
range -89.4159863 to 176.0593838 for the real part and less than 823549.66 for 
the imaginary part. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result

test interval:            -40.000 through 12.000 real
                          -10.000 through 157.08 imaginary

MRE:                      2.77x10^-8 (25.1 bits) real
                          2.88x10^-8 (25.0 bits) imaginary

RMS:                      6.51x10^-9 (27.2 bits) real
                          6.38x10^-9 (27.2 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          1%   19%   58%   21%    1%   real
                          1%   17%   59%   23%    1%   imaginary


Algorithm Used 

CEXP(z) is calculated as follows. 

Let z = x + i*y

If |y| > 823549.66

CEXP(z) = (0.0, 0.0)

If x < -89.4159863

CEXP(z) = (0.0, 0.0)

If x > 88.0296919 and y = 0.0

CEXP(z) = (+infinity, 0.0)

If 88.0296919 < x <= 176.0593838
and a component of the result is out of range,
that component is set to +infinity.

If x > 176.0593838 and y != 0.0

CEXP(z) = (±infinity, ±infinity)

Otherwise

CEXP(z) = e^x * (cos(y) + i*sin(y))
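
The main path and the simpler limit checks can be sketched in Python as follows. This is a partial illustration: the name cexp_sketch is hypothetical, and because Python floats have a wider range than single precision, the component-by-component overflow cases for a nonzero imaginary part are not modelled.

```python
import math

Y_LIMIT = 823549.66
X_UNDERFLOW = -89.4159863
X_OVERFLOW = 88.0296919

def cexp_sketch(x, y):
    """Return (re, im) of e**(x + i*y) for the cases modelled here."""
    if abs(y) > Y_LIMIT:
        return 0.0, 0.0                 # imaginary part too large
    if x < X_UNDERFLOW:
        return 0.0, 0.0                 # e**x underflows
    if x > X_OVERFLOW and y == 0.0:
        return math.inf, 0.0            # stands in for (+infinity, 0.0)
    ex = math.exp(x)
    return ex * math.cos(y), ex * math.sin(y)
```

The limit on the imaginary part reflects the loss of significance in reducing a very large angle before the sine and cosine are evaluated.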
Error Conditions

The following table gives the possible error conditions and the resulting error
messages.

Error Conditions for CEXP

Real Part          Imaginary Part
of Argument        of Argument      Result                Error Message(s)

Any value          > 823549.66      (0.0,0.0)             #1

< -89.4159863      0.0              (0.0,0.0)             #2

                   Not 0.0 and      (0.0,0.0)             #2 and #3
                   < 823549.66

Between            Not 0.0 and      Underflow may         None or #2
-89.4159863        < 823549.66      occur on neither,     or #3 or
and 88.0296919                      either, or both       #2 and #3
                                    parts

> 88.0296919       0.0              (+infinity, 0.0)      #4

> 176.0593838      Not 0.0 and      (±infinity,           #4 and #5
                   < 823549.66      ±infinity)

Between            Not 0.0 and      Overflow may occur    None or #4
88.0296919 and     < 823549.66      on neither, either,   or #5 or
176.0593838                         or both parts         #4 and #5

Error Messages:

1. CEXP: ABS(IMAG(arg)) too large; result = zero

2. CEXP: Real part underflow

3. CEXP: Imaginary part underflow

4. CEXP: Real part overflow

5. CEXP: Imaginary part overflow
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CDEXP 



Description 

The CDEXP subroutine calculates the complex, double-precision, D-floating- 
point exponential function of its complex, double-precision, D-floating-point 
argument. That is: 



CDEXP(z,r) = e^z

z = location of input value
r = location of result



Routines Called 

CDEXP calls the DEXP, DSIN, DCOS, and MTHERR routines. 

Type of Argument 

CDEXP is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a complex 
double-precision, D-floating-point value in the range -89.415986292232944914 
to 176.059383862226109 for the real part and less than 6746518850.429 for the 
imaginary part. 

Type of Result 

The result returned is a complex, double-precision, D-floating-point value. It 
is returned in the second vector (r) supplied in the call. The real part of the 
result is returned in the first element of r; the imaginary part is returned in 
the second element of r. 



Accuracy of Result 

test interval:  -40.000 through 12.000 real
                -10.000 through 157.08 imaginary

MRE:            8.78x10^-19 (60.0 bits) real
                9.49x10^-19 (59.9 bits) imaginary

RMS:            1.90x10^-19 (62.2 bits) real
                1.87x10^-19 (62.2 bits) imaginary

LSB error distribution:

     -2     -1      0     +1     +2
     1%    23%    57%    18%     1%   real
     1%    20%    59%    19%     1%   imaginary
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Algorithm Used 

CDEXP(z) is calculated as follows.

Let z = x + i*y

If |y| > 6746518850.429
    CDEXP(z) = (0.0,0.0)

If x < -89.415986292232944914
    CDEXP(z) = (0.0,0.0)

If x > 88.029691931113054295 and y = 0.0
    CDEXP(z) = (+infinity, 0.0)

If 88.029691931113054295 < x < 176.059383862226109
    and a component of the result is out of range,
    that component is set to +infinity.

If x > 176.059383862226109 and y ≠ 0.0
    CDEXP(z) = (±infinity, ±infinity)

Otherwise
    CDEXP(z) = e^x * (cos(y) + i*sin(y))

Error Conditions 

The following table gives the possible error conditions and the resulting error 
messages. 

Error Conditions for CDEXP

Real Part                   Imaginary Part       Result                Error Message(s)
of Argument                 of Argument

Any Value                   > 6746518850.429     (0.0,0.0)             #1

< -89.415986292232944914    0.0                  (0.0,0.0)             #2

                            Not 0.0 and          (0.0,0.0)             #2 and #3
                            < 6746518850.429

Between                     Not 0.0 and          Underflow may         None or #2
-89.415986292232944914      < 6746518850.429     occur on neither,     or #3 or
and 88.029691931113054295                        either, or both       #2 and #3
                                                 parts

> 88.029691931113054295     0.0                  (+infinity, 0.0)      #4

> 176.059383862226109       Not 0.0 and          (±infinity,           #4 and #5
                            < 6746518850.429     ±infinity)

Between                     Not 0.0 and          Overflow may          None or #4
88.029691931113054295 and   < 6746518850.429     occur on neither,     or #5 or
176.059383862226109                              either, or both       #4 and #5
                                                 parts

Error Messages:

1. CDEXP: ABS(IMAG(arg)) too large; result = zero

2. CDEXP: Real part underflow

3. CDEXP: Imaginary part underflow

4. CDEXP: REAL(arg) too large; REAL(result) = +infinity

5. CDEXP: REAL(arg) too large; IMAG(result) = +infinity
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CGEXP 



Description 

The CGEXP subroutine calculates the complex, double-precision, G-floating- 
point exponential function of its complex, double-precision, G-floating-point 
argument. That is: 

CGEXP(z,r) = e^z

z = location of input value 
r = location of result 

Routines Called 

CGEXP calls the GEXP, GSIN, GCOS, and the MTHERR routines. 

Type of Argument 

CGEXP is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a com- 
plex, double-precision, G-floating-point value in the range 
-710.475860073943942 to 1418.179131425648102 for the real part and less than 
1686629713.065 for the imaginary part. 

Type of Result 

The result returned is a complex, double-precision, G-floating-point value. It 
is returned in the second vector (r) supplied in the call. The real part of the 
result is returned in the first element of r; the imaginary part is returned in 
the second element of r. 



Accuracy of Result 

test interval:  -40.000 through 12.000 real
                -10.000 through 157.08 imaginary

MRE:            6.50x10^-18 (57.1 bits) real
                6.67x10^-18 (57.1 bits) imaginary

RMS:            1.53x10^-18 (59.2 bits) real
                1.44x10^-18 (59.3 bits) imaginary

LSB error distribution:

     -2     -1      0     +1     +2
     1%    19%    57%    22%     1%   real
     0%    16%    60%    22%     1%   imaginary
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Algorithm Used 

CGEXP(z) is calculated as follows. 

Let z = x + i*y

If |y| > 1686629713.065
    CGEXP(z) = (0.0,0.0)

If x < -710.475860073943942
    CGEXP(z) = (0.0,0.0)

If x > 709.089565 and y = 0.0
    CGEXP(z) = (+infinity, 0.0)

If 709.089565 < x < 1418.179131425648102
    and a component of the result is out of range,
    that component is set to +infinity.

If x > 1418.179131425648102 and y ≠ 0.0
    CGEXP(z) = (±infinity, ±infinity)

Otherwise
    CGEXP(z) = e^x * (cos(y) + i*sin(y))

Error Conditions 

The table below shows the possible values of the argument that could cause 
error conditions. 

Error Conditions for CGEXP

Real Part                 Imaginary Part       Result                Error Messages
of Argument               of Argument

Any value                 > 1686629713.065     (0.0,0.0)             #1

< -710.475860073943942    0.0                  (0.0,0.0)             #2

                          Not 0.0 and          (0.0,0.0)             #2 and #3
                          < 1686629713.065

Between                   Not 0.0 and          Underflow may         None or #2 or #3
-710.475860073943942      < 1686629713.065     occur on neither,     or #2 and #3
and 709.089565                                 either, or both
                                               parts

> 709.089565              0.0                  (+infinity, 0.0)      #4

> 1418.179131425648102    Not 0.0 and          (±infinity,           #4 and #5
                          < 1686629713.065     ±infinity)

Between                   Not 0.0 and          Overflow may          None or #4 or #5
709.089565 and            < 1686629713.065     occur on neither,     or #4 and #5
1418.179131425648102                           either, or both
                                               parts

Error Messages:

1. CGEXP: ABS(IMAG(arg)) too large; result = zero

2. CGEXP: Real part underflow

3. CGEXP: Imaginary part underflow

4. CGEXP: REAL(arg) too large; REAL(result) = +infinity

5. CGEXP: REAL(arg) too large; IMAG(result) = +infinity
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EXP1 



Description 

The EXP1. routine raises one integer to the power of another integer. That is: 

EXP1.(m,n) = m^n

Routines Called 

EXP1. calls the MTHERR routine. 

Type of Arguments 

The two arguments must be integer values; they can be any such values. 

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

EXP1.(m,n) is calculated as shown in the following table.

Calculations for EXP1.

Value of m      Value of n      Result

≠0              0               1
0               0               0
0               >0              0
0               <0              +infinity
+1              any value       1
-1              even            1
-1              odd             -1
≠±1             <0              0
≠±1             >0              m^n
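The case analysis for EXP1. can be sketched in Python as follows. This is a hedged illustration rather than the library code: the name exp1_sketch is hypothetical, INT_INF is an assumed stand-in for "+infinity" (the largest positive integer of a 36-bit word), and overflow of m^n is not modeled because Python integers do not overflow.

```python
INT_INF = 2**35 - 1  # assumption: largest positive 36-bit integer


def exp1_sketch(m: int, n: int) -> int:
    """Integer power following the EXP1. special cases."""
    if n == 0:
        return 1 if m != 0 else 0   # 0**0 is reported as an error; result 0
    if m == 0:
        return 0 if n > 0 else INT_INF
    if m == 1:
        return 1
    if m == -1:
        return 1 if n % 2 == 0 else -1
    if n < 0:
        return 0                    # |m| > 1: the reciprocal truncates to zero
    return m ** n
```

The ordering of the tests matters: the zero-exponent and zero-base cases are resolved before the general power is attempted.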



Error Conditions 

1. If the exponent is so large that the result overflows, the following message is
issued and the result is set to ±infinity.

EXP1.: Result overflow 

2. If both the base and the exponent are 0, the following message is issued 
and the result is set to 0. 

EXP1.: Zero**zero is indeterminate, result = zero 
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EXP2. 



Description 

The EXP2. routine raises a single-precision, floating-point number to the 
power of an integer. That is: 

EXP2.(x,n) = x^n

Routines Called 

EXP2. calls the MTHERR routine. 

Types of Arguments 

There are two arguments. The base must be a single-precision, floating-point 
value, and the exponent must be an integer value. They can be any such 
values. 

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. 



Accuracy of Result 

test interval
        x                 n      MRE                       RMS

.50000 through 1.0000     2      7.45x10^-9 (27.0 bits)    3.48x10^-9 (28.1 bits)
.50000 through 1.0000    -5      3.07x10^-8 (25.0 bits)    8.88x10^-9 (26.7 bits)
.50000 through 1.0000     9      5.53x10^-8 (24.1 bits)    1.61x10^-8 (25.9 bits)
.50000 through 1.0000   -12      7.91x10^-8 (23.6 bits)    2.37x10^-8 (25.3 bits)
.50000 through 1.0000    15      9.08x10^-8 (23.4 bits)    2.70x10^-8 (25.1 bits)
.50000 through 1.0000   -20      1.27x10^-7 (22.9 bits)    3.95x10^-8 (24.6 bits)
.50000 through 1.0000    40      2.65x10^-7 (21.8 bits)    7.87x10^-8 (23.6 bits)
total                            2.65x10^-7 (21.8 bits)    3.67x10^-8 (24.7 bits)

LSB error distribution according to the value of n

            -4+    -3    -2    -1     0    +1    +2    +3   +4+
n =   2      0%    0%    0%    0%  100%    0%    0%    0%    0%
n =  -5      0%    0%    5%   24%   41%   25%    5%    0%    0%
n =   9      1%    4%   13%   21%   23%   21%   13%    4%    1%
n = -12      7%    8%   13%   15%   15%   15%   12%    8%    7%
n =  15      9%    9%   12%   13%   13%   13%   12%    9%    9%
n = -20     20%    8%    9%    9%    9%    9%    9%    8%   20%
n =  40     34%    4%    5%    5%    5%    5%    5%    5%   34%
total       10%    5%    8%   12%   29%   12%    8%    5%   10%
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Algorithm Used 

EXP2.(x,n) is calculated as shown in the following table. 

Calculations for EXP2.

Value of x      Value of n      Result

≠0.0            0               1.0
0.0             0               0.0
0.0             >0              0.0
0.0             <0              +infinity
>0.0            >0              x^n



Error Conditions 

1. If the exponent has sufficiently large magnitude, overflow occurs in one of 
the following ways: 



   Base            Exponent           Result

   >1.0            positive           +infinity
   <-1.0           positive, even     +infinity
                   positive, odd      -infinity
   0.0 to 1.0      negative           +infinity
   -1.0 to 0.0     negative, even     +infinity
                   negative, odd      -infinity



and the following message is issued. 
EXP2.: Result overflow 

2. If the exponent has sufficiently large magnitude, underflow occurs in one 
of the following ways: 

Magnitude of Base Exponent Result 

> 1.0 negative 0.0 

< 1.0 positive 0.0 

and the following message is issued. 
EXP2.: Result underflow 

3. If both the exponent and the base are zero, the following message is issued 
and a result of zero is returned. 

EXP2.: Zero**zero is indeterminate, result = zero 
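The overflow and underflow rules above reduce to two questions: does the magnitude grow or shrink, and does an odd power of a negative base flip the sign? The Python sketch below captures that logic. It is an illustration only; the name exp2_out_of_range is hypothetical, math.inf stands in for the machine's infinity, and the caller is assumed to have already established that the result is out of range.

```python
import math


def exp2_out_of_range(x: float, n: int) -> float:
    """Value returned when x**n is out of range, per the overflow/underflow tables.

    Assumes x is nonzero, |x| != 1.0, and the magnitude of n is large enough
    that overflow or underflow really occurs.
    """
    grows = (abs(x) > 1.0 and n > 0) or (abs(x) < 1.0 and n < 0)
    if not grows:
        return 0.0                       # underflow
    negative = x < 0.0 and n % 2 != 0    # an odd power keeps the sign of the base
    return -math.inf if negative else math.inf
```

For instance, a base of -0.5 with a large negative odd exponent yields -infinity, matching the last row of the overflow table.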
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DEXP2. 



Description 

The DEXP2. routine raises a double-precision, D-floating-point number to
the power of an integer. That is:

DEXP2.(x,n) = x^n

Routines Called 

DEXP2. calls the MTHERR routine. 

Type of Arguments 

There are two arguments. The base must be a double-precision, D-floating- 
point value, and the exponent must be an integer value. They can be any such 
values. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 

Accuracy of Result 



test interval
        x                 n      MRE                        RMS

.50000 through 1.0000     2      2.16x10^-19 (62.0 bits)    1.01x10^-19 (63.1 bits)
.50000 through 1.0000    -9      1.62x10^-18 (59.1 bits)    4.72x10^-19 (60.9 bits)
.50000 through 1.0000    12      2.27x10^-18 (58.6 bits)    6.79x10^-19 (60.4 bits)
.50000 through 1.0000    15      2.73x10^-18 (58.3 bits)    7.89x10^-19 (60.1 bits)
.50000 through 1.0000   -40      7.50x10^-18 (56.9 bits)    2.31x10^-18 (58.6 bits)
total                            7.50x10^-18 (56.9 bits)    1.15x10^-18 (59.6 bits)

LSB error distribution according to the value of n

            -4+    -3    -2    -1     0    +1    +2    +3   +4+
n =   2      0%    0%    0%    0%  100%    0%    0%    0%    0%
n =  -9      1%    4%   12%   20%   23%   20%   12%    5%    2%
n =  12      6%    8%   12%   15%   16%   15%   13%    9%    6%
n =  15      9%    9%   12%   13%   13%   13%   12%    9%    9%
n = -40     34%    4%    5%    4%    5%    5%    4%    4%   34%
total       10%    5%    8%   11%   31%   11%    8%    5%   10%
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Algorithm Used 

DEXP2.(x,n) is calculated as shown in the following table. 

Calculations for DEXP2.

Value of x      Value of n      Result

≠0.0            0               1.0
0.0             0               0.0
0.0             >0              0.0
0.0             <0              +infinity
>0.0            >0              x^n


Error Conditions 






1. If the exponent has sufficiently large magnitude, overflow occurs in one of
   the following ways:

   Base            Exponent           Result

   >1.0            positive           +infinity
   <-1.0           positive, even     +infinity
                   positive, odd      -infinity
   0.0 to 1.0      negative           +infinity
   -1.0 to 0.0     negative, even     +infinity
                   negative, odd      -infinity

   and the following error message is issued.

   DEXP2.: Result overflow

2. If the exponent has sufficiently large magnitude, underflow occurs in one
   of the following ways:

   Magnitude of Base    Exponent    Result

   > 1.0                negative    0.0
   < 1.0                positive    0.0

   and the following message is issued.

   DEXP2.: Result underflow

3. If both the exponent and the base are zero, the following message is issued
   and the result is set to zero.

   DEXP2.: Zero**zero is indeterminate, result = zero
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GEXP2. 



Description 

The GEXP2. routine raises a double-precision, G-floating-point number to the
power of an integer. That is:

GEXP2.(x,n) = x^n

Routines Called 

GEXP2. calls the MTHERR routine. 

Type of Arguments 

There are two arguments. The base must be a double-precision, G-floating- 
point value; it can be any such value. The exponent must be an integer value; 
it can be any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 

Accuracy of Result 



test interval
        x                 n      MRE                        RMS

.50000 through 1.0000     2      1.72x10^-18 (59.0 bits)    8.11x10^-19 (60.1 bits)
.50000 through 1.0000    -9      1.26x10^-17 (56.1 bits)    3.79x10^-18 (57.9 bits)
.50000 through 1.0000    12      1.69x10^-17 (55.7 bits)    5.45x10^-18 (57.3 bits)
.50000 through 1.0000    15      2.13x10^-17 (55.4 bits)    6.27x10^-18 (57.1 bits)
.50000 through 1.0000   -40      5.64x10^-17 (54.0 bits)    1.85x10^-17 (55.6 bits)
total                            5.64x10^-17 (54.0 bits)    9.25x10^-18 (56.6 bits)

LSB error distribution according to the value of n

            -4+    -3    -2    -1     0    +1    +2    +3   +4+
n =   2      0%    0%    0%    0%  100%    0%    0%    0%    0%
n =  -9      2%    5%   12%   21%   23%   20%   12%    4%    1%
n =  12      6%    8%   13%   16%   15%   15%   13%    8%    6%
n =  15      9%    9%   12%   13%   14%   13%   12%    9%    9%
n = -40     34%    4%    4%    5%    4%    5%    5%    4%   34%
total       10%    5%    8%   11%   31%   10%    8%    5%   10%
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Algorithm Used 

GEXP2.(x,n) is calculated as shown in the following table. 

Calculations for GEXP2.

Value of x      Value of n      Result

≠0.0            0               1.0
0.0             0               0.0
0.0             >0              0.0
0.0             <0              +infinity
>0.0            >0              x^n


Error Conditions

1. If the exponent has sufficiently large magnitude, overflow occurs in one of
   the following ways:

   Base            Exponent           Result

   >1.0            positive           +infinity
   <-1.0           positive, even     +infinity
                   positive, odd      -infinity
   0.0 to 1.0      negative           +infinity
   -1.0 to 0.0     negative, even     +infinity
                   negative, odd      -infinity



and the following error message is issued: 
GEXP2.: Result overflow 

2. If the exponent has sufficiently large magnitude, underflow occurs in one 
of the following ways: 

Magnitude of Base Exponent Result 

> 1.0 negative 0.0 

< 1.0 positive 0.0 

and the following message is issued: 
GEXP2.: Result underflow 

3. If both the exponent and the base are zero, the following message is issued 
and the result is set to zero. 

GEXP2.: Zero**zero is indeterminate, result = zero 
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CEXP2. 



Description 

The CEXP2. routine raises a complex, single-precision, floating-point number 
to the power of an integer. That is: 

CEXP2.(z,n) = z^n

Routines Called 

CEXP2. calls the CDLOG, DLOG, DSIN, DCOS, DEXP, and MTHERR 
routines. 

Type of Arguments 

There are two arguments. The base must be a complex, single-precision, 
floating-point value, and the exponent must be an integer. They can be any 
such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 



Accuracy of Result

test interval:  .50000 through 1.0000 for z (real)
                .50000 through 1.0000 for z (imaginary)
                -10 through 20 for n

MRE:            7.45x10^-9 (27.0 bits) real
                7.45x10^-9 (27.0 bits) imaginary

RMS:            3.17x10^-9 (28.2 bits) real
                3.16x10^-9 (28.2 bits) imaginary

LSB error distribution:

     -2     -1      0     +1     +2
     0%     0%   100%     0%     0%   real
     0%     0%   100%     0%     0%   imaginary

When the ratio of the imaginary part of the base to the real part is less than
-10^10, one part of the result is less accurate. Which part is less accurate
depends on the exponent. For example:

test interval:  -1.00000x10^-10 through -1.00000x10^-15 for z (real)
                -2.0000 through -1.0000 for z (imaginary)
                -1 for n

LSB error distribution:

     -2     -1      0     +1     +2
     0%     6%    65%    28%     2%   real
     0%     0%   100%     0%     0%   imaginary

test interval:  -1.00000x10^-10 through -1.00000x10^-15 for z (real)
                -2.0000 through -1.0000 for z (imaginary)
                2 for n

LSB error distribution:

     -2     -1      0     +1     +2
     0%     0%   100%     0%     0%   real
     6%    27%    60%     8%     0%   imaginary
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Algorithm Used 

CEXP2.(z,n) is calculated as follows.

Let z = x + i*y

First the routine checks for the special cases shown in the following table.

Special Cases for CEXP2.

Value of x      Value of y      Value of n      Result

any value       any value       1               x + i*y
0.0             0.0             <0              (+infinity, +infinity)
0.0             0.0             0               (0.0,0.0)
0.0             0.0             >0              (0.0,0.0)
not both 0.0                    0               (1.0,0.0)

If none of the special cases applies, the routine continues calculations as
follows.

The CEXP2. function is evaluated as the complex exponential of
n*(LNRHO + i*THETA).

LNRHO is the real part of:

    log_e(x + i*y)

THETA is the imaginary part of:

    log_e(x + i*y)

The real part of n*(LNRHO + i*THETA) is:

    ALPHA = n*LNRHO

and the imaginary part is:

    PHI = n*THETA

Since it is ultimately e^(i*PHI) that is needed, it would appear that sin(PHI)
and cos(PHI) are needed. However, these functions will be multiplied by
e^ALPHA, and the handling of exception boundaries on the product will be
expedited by use of log_e(sin(PHI)) and log_e(cos(PHI)), which will be added
to ALPHA before the call to the DEXP function. The absolute values of
sin(PHI) and cos(PHI) are used as arguments of the CDLOG function; the
signs of sin(PHI) and cos(PHI) are stored for use in determining the signs
for the real and imaginary parts of the complex exponential, CEXP.

The real part of the final result is:

    sgn(cos(PHI)) * e^(ALPHA + log_e(|cos(PHI)|))

The imaginary part of the final result is:

    sgn(sin(PHI)) * e^(ALPHA + log_e(|sin(PHI)|))
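The log-and-exponentiate scheme above can be sketched in Python as follows. This is a minimal illustration, not the library routine: cexp2_sketch is a hypothetical name, the standard cmath.log and math.exp stand in for CDLOG and DEXP, math.inf stands in for the machine's infinity, and no overflow, underflow, or argument-reduction checks are made.

```python
import cmath
import math


def cexp2_sketch(z: complex, n: int) -> complex:
    """Raise complex z to the integer power n via n*(LNRHO + i*THETA)."""
    if n == 1:
        return z
    if z == 0:
        # 0**negative -> (+inf, +inf); 0**0 and 0**positive -> (0.0, 0.0)
        return complex(math.inf, math.inf) if n < 0 else 0j
    if n == 0:
        return complex(1.0, 0.0)

    log_z = cmath.log(z)            # LNRHO + i*THETA
    alpha = n * log_z.real          # ALPHA = n*LNRHO
    phi = n * log_z.imag            # PHI = n*THETA
    c, s = math.cos(phi), math.sin(phi)

    def part(t: float) -> float:
        # sgn(t) * e**(ALPHA + ln|t|); a zero factor gives a zero part.
        if t == 0.0:
            return 0.0
        return math.copysign(math.exp(alpha + math.log(abs(t))), t)

    return complex(part(c), part(s))
```

Adding ln|cos(PHI)| or ln|sin(PHI)| to ALPHA before the single exponential call means that only one range check on the exponent is needed per part, which is the design choice the text describes.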
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Error Conditions 

The following error messages are returned for error conditions detected during 
the check for the special cases shown above. Other errors detected will result 
in error messages relating to the CEXP3. routine because CEXP2. is part of 
the CEXP3. routine. 

1. If both the real and imaginary parts of the argument are zero and the 
exponent is also zero, the following message is issued and the result is set 
to (0.0,0.0). 

CEXP2.: Zero**zero is indeterminate, result = zero 

2. If both the real and imaginary parts of the argument are zero and the 
exponent is negative, the following message is issued and the result is set 
to (infinity, infinity). 

CEXP2.: Zero**negative exponent, result = infinity

3. If PHI > 6746518852, argument reduction for sin/cos is impossible so the 
following message is issued and the result is set to (+infinity, +infinity). 

CEXP2.: Both parts indeterminate 

4. If the base and/or the exponent are such that one or both parts of the 
result overflow, one of the following messages is issued and the corre- 
sponding result is set to ± infinity. 

CEXP2.: Real part overflow 
CEXP2.: Imaginary part overflow 
CEXP2.: Both parts overflow 

5. If the base and/or the exponent are such that one or both parts of the 
result underflows, one of the following messages is issued and the corre- 
sponding result is set to 0.0. 

CEXP2.: Real part underflow 
CEXP2.: Imaginary part underflow 
CEXP2.: Both parts underflow 
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EXP3. 



Description 

The EXP3. routine raises a single-precision, floating-point number to the 
power of another single-precision, floating-point number. That is: 

EXP3.(x,y) = x^y

Routines Called 

EXP3. calls the MTHERR routine. 

Type of Arguments 

There are two arguments; both must be single-precision, floating-point val- 
ues. The base must not be less than zero unless the exponent is an integer. 
The base must not be equal to zero unless the exponent is greater than zero. 

Type of Result 

The result returned is a single-precision, floating-point value in the range
2^-129 to 2^127.



Accuracy of Result 



test interval
        x                 y        MRE                       RMS

.50000 through 1.0000     5.1      1.52x10^-8 (26.0 bits)    4.70x10^-9 (27.7 bits)
.50000 through 1.0000   -10.1      1.86x10^-8 (25.7 bits)    4.92x10^-9 (27.6 bits)
.50000 through 1.0000    15.1      2.27x10^-8 (25.4 bits)    5.42x10^-9 (27.5 bits)
.50000 through 1.0000   -20.1      3.14x10^-8 (24.9 bits)    6.05x10^-9 (27.3 bits)
.50000 through 1.0000    30.1      3.90x10^-8 (24.6 bits)    7.32x10^-9 (27.0 bits)
.50000 through 1.0000   -50.1      6.18x10^-8 (23.9 bits)    1.07x10^-8 (26.5 bits)
.50000 through 1.0000    80.1      9.04x10^-8 (23.4 bits)    1.60x10^-8 (25.9 bits)
total                              9.04x10^-8 (23.4 bits)    8.74x10^-9 (26.8 bits)

LSB error distribution according to the value of y

              -4+    -3    -2    -1     0    +1    +2    +3   +4+
y =   5.1      0%    0%    0%   12%   74%   14%    0%    0%    0%
y = -10.1      0%    0%    0%   11%   70%   19%    0%    0%    0%
y =  15.1      0%    0%    0%   18%   66%   16%    0%    0%    0%
y = -20.1      0%    0%    0%   14%   61%   24%    1%    0%    0%
y =  30.1      0%    0%    3%   21%   56%   18%    1%    0%    0%
y = -50.1      0%    0%    3%   17%   46%   23%    7%    2%    1%
y =  80.1      4%    4%    9%   19%   36%   19%    6%    2%    1%
total          1%    1%    2%   16%   58%   19%    2%    1%    0%
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Algorithm Used 

EXP3.(x,y) is calculated as follows.

First the routine checks for the special cases shown in the following table.

Special Cases for EXP3.

Value of x      Value of y        Result

0.0             >0.0              0.0
0.0             0.0               0.0
0.0             <0.0              infinity
≠0.0            0.0               1.0
<0.0            odd integer       <0.0
<0.0            even integer      >0.0
<0.0            not integer       (-x)^y

Otherwise
    x^y = 2^w
    w = y*log2(x)

    log2(x) is calculated as follows:
        x = 2^m * f where .5 < f < 1.0
        Let p be an odd integer < 16 and
        let a = 2^(-p/16)
        Then select p to minimize |a-f|

        now x = 2^m * a * (f/a)
        Then log2(x) = m + log2(a) + log2(f/a) or
        log2(x) = m - p/16 + log2(f/a)

        Let u1 = m - p/16 and
        u2 = log2(f/a) = log2((1+s)/(1-s))
        Then log2(x) = u1 + u2 and
        s = (f-a)/(f+a)

    A rational approximation is used to evaluate u2; u1 and u2 are then
    used to determine w1 and w2.
        w = y*log2(x) = w1 + w2 and
        w1 = FLOAT(INT(w*16.0))/16.0 = m1 + p1/16
        m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally

    If -129 < w < 127

        EXP3.(x,y) = x^y = 2^w is reconstructed as:
        EXP3.(x,y) = 2^w1 * 2^w2

        2^w1 is evaluated by table lookup and 2^w2 is evaluated from an-
        other rational approximation.
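The decomposition above can be traced step by step in Python. The sketch below is an illustration under stated assumptions, not the library code: exp3_sketch is a hypothetical name, math.log2 replaces the rational approximation for u2, ordinary exponentiation replaces the table lookup and the second rational approximation, and only a positive base is handled.

```python
import math


def exp3_sketch(x: float, y: float) -> float:
    """Compute x**y for x > 0 through the 2**w decomposition."""
    f, m = math.frexp(x)                 # x = f * 2**m with 0.5 <= f < 1
    # Choose the odd p < 16 whose a = 2**(-p/16) lies closest to f.
    p = min(range(1, 16, 2), key=lambda k: abs(2.0 ** (-k / 16) - f))
    a = 2.0 ** (-p / 16)
    s = (f - a) / (f + a)
    u1 = m - p / 16
    u2 = math.log2((1 + s) / (1 - s))    # equals log2(f/a); library: rational approx.
    w = y * (u1 + u2)                    # w = y*log2(x)
    w1 = int(w * 16.0) / 16.0            # FLOAT(INT(w*16.0))/16.0
    w2 = w - w1
    return (2.0 ** w1) * (2.0 ** w2)     # library: table lookup times rational approx.
```

Because a is chosen close to f, the quantity s is small, which is what makes a short rational approximation for u2 accurate; splitting w into a multiple of 1/16 and a small remainder serves the same purpose on the way back.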
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Error Conditions 

1. If the base is a negative value and the exponent is not an integer, the 
following message is issued and the calculation proceeds using the abso- 
lute value of the base. 

EXP3.: Negative base**non-integer; ABS(base) used 

2. If the base is 0.0 and the exponent is negative, the following message is 
issued and the result is set to infinity. 

EXP3.: Zero**negative exponent; result = infinity 

3. If both the base and the exponent are 0.0, the following message is issued 
and the result is set to 0.0. 

EXP3.: Zero**zero is indeterminate; result = zero 

4. If y*log2(x) > 127, the result overflows. Then the following message is issued
and the result is set to -infinity if x is less than 0.0 and y is an odd
integer. Otherwise, the result is set to +infinity.

EXP3.: Result overflow 

5. If y*log2(x) < -129, the result underflows. Then the following message is
issued and the result is set to 0.0.

EXP3.: Result underflow 
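The special cases and the first three error conditions for EXP3. can be summarized in a short Python sketch. It is an illustration only: exp3_special is a hypothetical name, math.inf stands in for the machine's infinity, error messages are indicated by comments rather than issued, and range checks on the result are omitted.

```python
import math


def exp3_special(x: float, y: float):
    """Return the special-case result, or None when the general algorithm applies."""
    if x == 0.0:
        if y > 0.0:
            return 0.0
        if y == 0.0:
            return 0.0               # message: Zero**zero is indeterminate
        return math.inf              # message: Zero**negative exponent
    if y == 0.0:
        return 1.0
    if x < 0.0:
        magnitude = (-x) ** y
        if y == int(y) and int(y) % 2 != 0:
            return -magnitude        # odd integer exponent: negative result
        # Even integer exponent, or a non-integer exponent, for which the
        # routine issues "Negative base**non-integer; ABS(base) used".
        return magnitude
    return None
```

A return value of None signals that the base is positive and the exponent nonzero, so the 2^w evaluation described under Algorithm Used takes over.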
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DEXP3. 



Description 

The DEXP3. routine raises a double-precision, D-floating-point number to 
the power of another double-precision, D-floating-point number. That is: 

DEXP3.(x,y) = x^y

Routines Called 

DEXP3. calls the MTHERR routine. 

Type of Argument 

There are two arguments; both must be double-precision, D-floating-point 
values. The base must not be less than zero unless the exponent is an integer. 
The base must not be equal to zero unless the exponent is greater than zero. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than
or equal to 2^-129 and less than or equal to 2^127.

Accuracy of Result 



test interval
        x                 y        MRE                        RMS

.50000 through 1.0000     5.1      5.23x10^-19 (60.7 bits)    1.45x10^-19 (62.6 bits)
.50000 through 1.0000   -10.1      5.50x10^-19 (60.7 bits)    1.46x10^-19 (62.6 bits)
.50000 through 1.0000    20.1      9.07x10^-19 (59.9 bits)    1.84x10^-19 (62.2 bits)
.50000 through 1.0000   -50.1      1.97x10^-18 (58.8 bits)    3.27x10^-19 (61.4 bits)
.50000 through 1.0000    80.1      3.02x10^-18 (58.2 bits)    5.10x10^-19 (60.8 bits)
total                              3.02x10^-18 (58.2 bits)    2.98x10^-19 (61.5 bits)

LSB error distribution according to the value of y

              -4+    -3    -2    -1     0    +1    +2    +3   +4+
y =   5.1      0%    0%    0%    7%   73%   20%    0%    0%    0%
y = -10.1      0%    0%    0%   13%   70%   17%    0%    0%    0%
y =  20.1      0%    0%    0%   11%   63%   25%    1%    0%    0%
y = -50.1      1%    2%    6%   19%   46%   21%    4%    1%    0%
y =  80.1      1%    2%    5%   16%   35%   22%   10%    5%    5%
total          0%    1%    2%   13%   57%   21%    3%    1%    1%



Algorithm Used 

DEXP3.(x,y) is calculated as follows. 

First the routine checks for the special cases shown in the following table. 






Special Cases for DEXP3.

Value of x      Value of y        Result

0.0             >0.0              0.0
0.0             0.0               0.0
0.0             <0.0              infinity
≠0.0            0.0               1.0
<0.0            odd integer       <0.0
<0.0            even integer      >0.0
<0.0            not integer       (-x)^y

Otherwise
    x^y = 2^w
    w = y*log2(x)

    log2(x) is calculated as follows:
        x = 2^m * f where .5 < f < 1.0
        Let p be an odd integer < 16 and
        let a = 2^(-p/16)
        Then select p to minimize |a-f|

        now x = 2^m * a * (f/a)
        Then log2(x) = m + log2(a) + log2(f/a) or
        log2(x) = m - p/16 + log2(f/a)

        Let u1 = m - p/16 and
        u2 = log2(f/a) = log2((1+s)/(1-s))
        Then log2(x) = u1 + u2 and
        s = (f-a)/(f+a)

    A rational approximation is used to evaluate u2; u1 and u2 are then
    used to determine w1 and w2.
        w = y*log2(x) = w1 + w2 and
        w1 = FLOAT(INT(w*16.0))/16.0 = m1 + p1/16
        m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally

    If -129 < w < 127

        DEXP3.(x,y) = x^y = 2^w is reconstructed as:
        DEXP3.(x,y) = 2^w1 * 2^w2

        2^w1 is evaluated by table lookup and 2^w2 is evaluated from an-
        other rational approximation.
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Error Conditions 

1. If the base is a negative value and the exponent is not an integer, the 
following message is issued and the calculation proceeds using the abso- 
lute value of the base. 

DEXP3.: Negative base**non-integer; ABS(base) used 

2. If the base is 0.0 and the exponent is negative, the following message is 
issued and the result is set to infinity. 

DEXP3.: Zero**negative exponent; result = infinity

3. If both the base and the exponent are 0.0, the following message is issued 
and the result is set to 0.0. 

DEXP3.: Zero**zero is indeterminate; result = zero 

4. If y*log2(x) > 127, the result overflows. Then the following message is issued
and the result is set to -infinity if x is less than 0.0 and y is an odd
integer. Otherwise, the result is set to +infinity.

DEXP3.: Result overflow 

5. If y*log2(x) < -129, the result underflows. Then the following message is
issued and the result is set to 0.0.

DEXP3.: Result underflow 
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GEXP3. 



Description 

The GEXP3. routine raises a double-precision, G-floating-point number to 
the power of another double-precision, G-floating-point number. That is: 

GEXP3.(x,y) = x^y

Routines Called 

GEXP3. calls the MTHERR routine. 

Type of Arguments 

There are two arguments; both must be double-precision, G-floating-point 
values. The base must not be less than zero unless the exponent is an integer. 
The base must not be equal to zero unless the exponent is greater than zero. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range
2^-1025 to 2^1023.

Accuracy of Result 



test interval
        x                 y         MRE                        RMS

.50000 through 1.0000     5.10      3.69x10^-18 (57.9 bits)    1.18x10^-18 (59.6 bits)
.50000 through 1.0000   -10.10      4.91x10^-18 (57.5 bits)    1.22x10^-18 (59.5 bits)
.50000 through 1.0000    20.10      7.92x10^-18 (56.8 bits)    1.49x10^-18 (59.2 bits)
.50000 through 1.0000   -50.10      1.46x10^-17 (55.9 bits)    2.70x10^-18 (58.4 bits)
.50000 through 1.0000    80.10      2.17x10^-17 (55.4 bits)    4.13x10^-18 (57.7 bits)
total                               2.17x10^-17 (55.4 bits)    2.43x10^-18 (58.5 bits)

LSB error distribution according to the value of y

               -4+    -3    -2    -1     0    +1    +2    +3   +4+
y =   5.10      0%    0%    0%   14%   70%   16%    0%    0%    0%
y = -10.10      0%    0%    0%   12%   68%   20%    0%    0%    0%
y =  20.10      0%    0%    1%   19%   60%   19%    1%    0%    0%
y = -50.10      0%    1%    4%   17%   43%   24%    7%    2%    1%
y =  80.10      4%    5%    8%   18%   34%   19%    7%    3%    2%
total           1%    1%    3%   16%   55%   20%    3%    1%    1%



Algorithm Used 

GEXP3.(x,y) is calculated as follows. 

First the routine checks for the special cases shown in the following table. 
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Special Cases for GEXP3. 



Value of x 


Value of y 


Result 


0.0 


>0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


<0.0 


infinity 


*0.0 


0.0 


1.0 


<0.0 


odd integer 


<0.0 


<0.0 


even integer 


>0.0 


<0.0 


not integer 


(-x) y 



Otherwise
    x^y = 2^w

    w = y*log2(x)

log2(x) is calculated as follows:
    x = 2^m*f where .5 ≤ f < 1.0
    Let p be an odd integer < 16 and
    let a = 2^(-p/16)
    Then select p to minimize |a-f|

    now x = 2^m*a*(f/a)
    Then log2(x) = m+log2(a)+log2(f/a) or
    log2(x) = m-p/16+log2(f/a)

    Let u1 = m-p/16 and
    u2 = log2(f/a) = log2((1+s)/(1-s))
    Then log2(x) = u1+u2 and
    s = (f-a)/(f+a)

A rational approximation is used to evaluate u2; u1 and u2 are then
used to determine w1 and w2.
    w = y*log2(x) = w1+w2 and

    w1 = FLOAT(INT(w*16.0))/16.0 = m1+p1/16

    m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally

    If -1025 < w < 1023

    GEXP3.(x,y) = x^y = 2^w is reconstructed as:
    GEXP3.(x,y) = 2^w1*2^w2

    2^w1 is evaluated by table lookup and 2^w2 is evaluated from
    another rational approximation.
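
The decomposition above can be sketched in Python. This is only an
illustration under stated assumptions: the names log2_split and power are
hypothetical, math.log2 and the ** operator stand in for the library's
rational approximations and table lookup, and only a positive base is
handled.

```python
import math

def log2_split(x):
    """Return (u1, u2) with log2(x) = u1 + u2, following the steps above."""
    if x <= 0.0:
        raise ValueError("this sketch handles a positive base only")
    f, m = math.frexp(x)                  # x = 2**m * f with 0.5 <= f < 1.0
    # Choose the odd p < 16 whose a = 2**(-p/16) lies closest to f.
    p = min(range(1, 16, 2), key=lambda k: abs(2.0 ** (-k / 16.0) - f))
    a = 2.0 ** (-p / 16.0)
    s = (f - a) / (f + a)
    u1 = m - p / 16.0
    # Assumption: math.log2 replaces the rational approximation for u2.
    u2 = math.log2((1.0 + s) / (1.0 - s))
    return u1, u2

def power(x, y):
    """x**y rebuilt as 2**w1 * 2**w2, with w1 a multiple of 1/16."""
    u1, u2 = log2_split(x)
    w = y * (u1 + u2)
    w1 = float(int(w * 16.0)) / 16.0      # FLOAT(INT(w*16.0))/16.0
    w2 = w - w1
    return 2.0 ** w1 * 2.0 ** w2
```

Because u2 is small, the library can afford a short approximation for it;
the sketch does not reproduce the extra care the routine takes when
forming w1 and w2 from u1 and u2.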






Error Conditions 

1. If the base is a negative value and the exponent is not an integer, the 
following message is issued and the calculation proceeds using the abso- 
lute value of the base. 

GEXP3.: Negative base**non-integer; ABS(base) used 

2. If the base is 0.0 and the exponent is negative, the following message is 
issued and the result is set to infinity. 

GEXP3.: Zero**negative exponent; result = infinity 

3. If both the base and the exponent are 0.0, the following message is issued 
and the result is set to 0.0. 

GEXP3.: Zero**zero is indeterminate, result = zero 

4. If y*log2(x) > 1023, the result overflows, the following message is issued,
and the result is set to -infinity if x is less than 0.0 and y is an odd
integer. Otherwise, the result is set to +infinity.

GEXP3.: Result overflow 

5. If y*log2(x) < -1025, the result underflows, the following message is issued,
and the result is set to 0.0.

GEXP3.: Result underflow 
CEXP3.



Description 

The CEXP3. routine raises a complex, single-precision, floating-point number 
to the power of another complex, single-precision, floating-point number. 
That is: 

CEXP3.(z,g) = z^g

Routines Called 

CEXP3. calls the CDLOG, DLOG, DSIN, DCOS, DEXP, and MTHERR routines.

Type of Arguments 

There are two arguments; both must be complex, single-precision, floating- 
point values. They can be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value. It may 
be any such value. 



Accuracy of Result 



test interval:            .50000 through 1.0000 for z (real)
                          .50000 through 1.0000 for z (imaginary)
                          -100.00 through 207.00 for g (real)
                          -163.00 through 7.00 for g (imaginary)

MRE:                      7.45x10^-9 (27.0 bits) real
                          7.45x10^-9 (27.0 bits) imaginary

RMS:                      3.17x10^-9 (28.2 bits) real
                          3.17x10^-9 (28.2 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          0%    0%  100%    0%    0%   real
                          0%    0%  100%    0%    0%   imaginary



When the ratio of the imaginary part of the base to the real part is less than
10^-10, one part of the result is less accurate. Which part is less accurate
depends on the exponent. For example:



test interval:            -1.00000x10^10 through -1.00000x10^15 for z (real)
                          -2.0000 through -1.0000 for z (imaginary)
                          (-1,0) for g

LSB error distribution:   -2    -1     0    +1    +2
                          0%    6%   65%   28%    2%   real
                          0%    0%  100%    0%    0%   imaginary

test interval:            -1.00000x10^10 through -1.00000x10^15 for z (real)
                          -2.0000 through -1.0000 for z (imaginary)
                          (2,0) for g

LSB error distribution:   -2    -1     0    +1    +2
                          0%    0%  100%    0%    0%   real
                          6%   27%   60%    8%    0%   imaginary
Algorithm Used

CEXP3.(z,g) is calculated as follows.

Let z = x+i*y
    g = a+i*b

First the routine checks for the special cases shown in the following table.

Special Cases for CEXP3.

Value of x    Value of y    Value of a    Result

0.0           0.0           >0.0          (0.0,0.0)
0.0           0.0           <0.0          (+infinity,+infinity)
0.0           0.0           0.0           (0.0,0.0)

If none of the special cases applies, the routine continues calculation as
follows.

If x and y are not both 0.0
    (x+i*y)^g is rewritten as

    e^(g*loge(x+i*y))

The CEXP3. function is evaluated as the complex exponential of
(a+i*b)*(LNRHO+i*THETA).
LNRHO is the real part of:

    loge(x+i*y)
THETA is the imaginary part of:

    loge(x+i*y)
The real part of (a+i*b)*(LNRHO+i*THETA) is:

    ALPHA = a*LNRHO-b*THETA
and the imaginary part is:

    PHI = a*THETA+b*LNRHO

Since it is ultimately e^(i*PHI) that is needed, it would appear that sin(PHI)
and cos(PHI) are needed. However, these functions will be multiplied by
e^ALPHA, and the handling of exception boundaries on the product will be
expedited by use of loge(sin(PHI)) and loge(cos(PHI)), which will be added
to ALPHA before the call to the DEXP function. The absolute values of
sin(PHI) and cos(PHI) are used as arguments of the CDLOG function; the
signs of sin(PHI) and cos(PHI) are stored for use in determining the signs
for the real and imaginary parts of the complex exponential, CEXP.

The real part of the final result is:

    sgn(cos(PHI))*e^(ALPHA+loge(|cos(PHI)|))

The imaginary part of the final result is:

    sgn(sin(PHI))*e^(ALPHA+loge(|sin(PHI)|))
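
The ALPHA/PHI formulation can be sketched in Python as follows. The name
cexp3 is hypothetical, the standard math functions stand in for the DLOG,
DSIN, DCOS, and DEXP calls, and the special cases for a zero base are not
handled. Note that Python's math.exp raises OverflowError where the library
would issue a message and return infinity.

```python
import math

def cexp3(z, g):
    """Sketch of z**g for a nonzero complex base z, using ALPHA and PHI."""
    x, y = z.real, z.imag
    a, b = g.real, g.imag
    lnrho = math.log(math.hypot(x, y))    # real part of loge(x+i*y)
    theta = math.atan2(y, x)              # imaginary part of loge(x+i*y)
    alpha = a * lnrho - b * theta
    phi = a * theta + b * lnrho
    c, s = math.cos(phi), math.sin(phi)

    def part(t):
        # sgn(t) * e**(ALPHA + loge(|t|)); a zero factor gives a zero part.
        if t == 0.0:
            return 0.0
        return math.copysign(math.exp(alpha + math.log(abs(t))), t)

    return complex(part(c), part(s))
```

Adding loge(|cos(PHI)|) or loge(|sin(PHI)|) to ALPHA before the single
exponential means overflow and underflow can be judged from one exponent
for each part, which is the point made in the text.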
Error Conditions

1. If both the real and imaginary parts of both arguments are 0.0, the follow- 
ing message is issued and the result is set to (0.0,0.0). 

CEXP3.: Zero**zero is indeterminate; result = zero 

2. If both the real and imaginary parts of the base are zero and the real part 
of the exponent is negative, the following message is issued and the result 
is set to (+infinity,+infinity). 

CEXP3.: Zero**(negative,non-zero) is indeterminate,
result = (infinity,infinity)

3. If PHI > 6746518852, argument reduction for sin/cos is impossible so the 
following message is issued and the result is set to (+infinity,+infinity). 

CEXP3.: Both parts indeterminate 

4. If the base and/or the exponent are such that one or both parts of the 
result overflow, one of the following messages is issued and the corre- 
sponding result is set to ± infinity. 

CEXP3.: Real part overflow 
CEXP3.: Imaginary part overflow 
CEXP3.: Both parts overflow 

5. If the base and/or the exponent are such that one or both parts of the 
result underflows, one of the following messages is issued and the corre- 
sponding result is set to (0.0). 

CEXP3.: Real part underflow 

CEXP3.: Imaginary part underflow 

CEXP3.: Real and imaginary parts underflow 
Chapter 5
Trigonometric Routines 



SIN 



Description 

The SIN routine calculates the single-precision, floating-point sine of the 
single-precision, floating-point angle given in radians as the argument. That 
is: 

SIN(x) = sin(x) 

Routines Called 

SIN calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than or 
equal to 210828714. 

Type of Result 

The result returned is a single-precision, floating-point value in the range -1.0 
to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      1.95x10^-8 (25.6 bits)
RMS:                      3.87x10^-9 (27.9 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   12%   78%   10%    0%



Algorithm Used 

SIN(x) is calculated as follows. Note that SIN(x) = -SIN(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.

    n = the nearest integer to |x|/π
Then the reduced argument is:

    f = |x|-π*n

If |f| < .863167530x10^-4
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = ((((r5*g+r4)*g+r3)*g+r2)*g+r1)*g

    r1 = -.166666666
    r2 = .833333072x10^-2
    r3 = -.198408328x10^-3
    r4 = .275239711x10^-5
    r5 = -.238683464x10^-7

Finally

    SIN(x) = sgn(x)*(-1)^n*sin(f)
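
As an illustration, the reduction and polynomial above can be written in
Python. The name sin_sketch is hypothetical, the exponents of r2 through r5
are read as negative powers of ten (the reading that matches the Taylor
coefficients of the sine), and Python double-precision arithmetic replaces
the single-precision arithmetic of the routine.

```python
import math

R1 = -0.166666666
R2 = 0.833333072e-2
R3 = -0.198408328e-3
R4 = 0.275239711e-5
R5 = -0.238683464e-7
SMALL = 0.863167530e-4        # below this, sin(f) = f
MAX_ARG = 210828714.0         # largest accepted |x|

def sin_sketch(x):
    ax = abs(x)
    if ax > MAX_ARG:
        return 0.0            # the library also issues its error message
    n = round(ax / math.pi)   # nearest integer to |x|/pi
    f = ax - math.pi * n      # reduced argument
    if abs(f) < SMALL:
        s = f
    else:
        g = f * f
        r = ((((R5 * g + R4) * g + R3) * g + R2) * g + R1) * g
        s = f + f * r
    sign = -1.0 if x < 0.0 else 1.0
    parity = -1.0 if n % 2 else 1.0   # (-1)**n
    return sign * parity * s
```

Writing the result as f+f*R(g) keeps the dominant term f exact, so the
polynomial only has to supply a small correction.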
Error Conditions

If the absolute value of the argument is greater than 210828714, the following 
message is issued and the result is set to 0.0. 

SIN: ABS(arg) too large; result = zero 
SIND



Description 

The SIND routine calculates the single-precision, floating-point sine of the
single-precision, floating-point angle given in degrees as the argument. That
is:

SIND(x) = sin(x) 

Routines Called 

SIND calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than or 
equal to 47185919. 

Type of Result 

The result returned is a single-precision, floating-point value in the range -1.0 
to 1.0. 



Accuracy of Result 

test interval:            -1000.0 through 3600.0

MRE:                      1.95x10^-8 (25.6 bits)
RMS:                      4.11x10^-9 (27.9 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   13%   73%   14%    0%



Algorithm Used 

SIND(x) is calculated as follows. Note that SIND(x) = -SIND(-x).

Let |x| = 180*n+f
    |f| < 90

The argument reduction is as follows.

    n = the nearest integer to |x|/180
Then the reduced argument, converted to radians, is:

    f = (|x|-180*n)*(π/180)

If |f| < .863167530x10^-4
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = ((((r5*g+r4)*g+r3)*g+r2)*g+r1)*g
    r1 = -.166666666
    r2 = .833333072x10^-2
    r3 = -.198408328x10^-3
    r4 = .275239711x10^-5
    r5 = -.238683464x10^-7

Finally

    SIND(x) = sgn(x)*(-1)^n*sin(f)
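
A minimal Python sketch of the degree version follows; sind_sketch is a
hypothetical name and the coefficient exponents are read as negative powers
of ten, as for SIN. It shows one consequence of reducing in degrees before
converting to radians: multiples of 180 reduce to exactly zero.

```python
import math

# Coefficients r1..r5 of R(g), as listed above.
R = (-0.166666666, 0.833333072e-2, -0.198408328e-3,
     0.275239711e-5, -0.238683464e-7)

def sind_sketch(x):
    """Sine of an angle in degrees, reduced before conversion to radians."""
    ax = abs(x)
    if ax > 47185919.0:
        return 0.0                      # the library also reports the error
    n = round(ax / 180.0)               # nearest integer to |x|/180
    f = (ax - 180.0 * n) * (math.pi / 180.0)
    if abs(f) < 0.863167530e-4:
        s = f
    else:
        g = f * f
        r = 0.0
        for coeff in reversed(R):       # Horner evaluation: r5 first, r1 last
            r = (r + coeff) * g
        s = f + f * r
    sign = -1.0 if x < 0.0 else 1.0
    parity = -1.0 if n % 2 else 1.0     # (-1)**n
    return sign * parity * s
```

For example, sind_sketch(3600.0) is exactly zero, whereas converting 3600
degrees to radians first and then reducing would leave a small residue.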




Error Conditions 

If the absolute value of the argument is greater than 47185919, the following 
message is issued and the result is set to 0.0. 

SIND: ABS(arg) too large; result = zero 
COS



Description 

The COS routine calculates the single-precision, floating-point cosine of the 
single-precision, floating-point angle given in radians as the argument. That 
is: 

COS(x) = cos(x) 

Routines Called 

COS calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than 
210828714. 

Type of Result 

The result returned is a single-precision, floating-point value in the range -1.0 
to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      1.86x10^-8 (25.7 bits)
RMS:                      4.26x10^-9 (27.8 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   12%   70%   17%    0%



Algorithm Used 

COS(x) is calculated as follows. Note that COS(x) = COS(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.

    n = .5 + the nearest integer to |x|/π
Then the reduced argument is:

    f = |x|-π*n

If |f| < .863167530x10^-4
    sin(f) = f

Otherwise

    sin(f) = f+f*R(g)

    g = f^2

    R(g) = ((((r5*g+r4)*g+r3)*g+r2)*g+r1)*g
    r1 = -.166666666
    r2 = .833333072x10^-2
    r3 = -.198408328x10^-3
    r4 = .275239711x10^-5
    r5 = -.238683464x10^-7

Finally

    COS(x) = (-1)^(n+1)*sin(f)
Error Conditions

If the absolute value of the argument is greater than or equal to 210828714, the 
following message is issued and the result is set to 0.0. 

COS: ABS(arg) too large; result = zero 
COSD



Description 

The COSD routine calculates the single-precision, floating-point cosine of the 
single-precision, floating-point angle given in degrees as the argument. That 
is: 

COSD(x) = cos(x) 

Routines Called 

COSD calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than 
47185919. 

Type of Result 

The result returned is a single-precision, floating-point value in the range -1.0 
to 1.0. 



Accuracy of Result 

test interval:            -1000.0 through 3600.0

MRE:                      1.75x10^-8 (25.8 bits)
RMS:                      4.20x10^-9 (27.8 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   12%   72%   16%    0%



Algorithm Used 

COSD(x) is calculated as follows. Note that COSD(x) = COSD(-x).

Let |x| = 180*n+f
    |f| < 90

The argument reduction is:

    n = .5 + the nearest integer to |x|/180
Then the reduced argument, converted to radians, is:

    f = (|x|-180*n)*(π/180)

If |f| < .863167530x10^-4
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = ((((r5*g+r4)*g+r3)*g+r2)*g+r1)*g
    r1 = -.166666666
    r2 = .833333072x10^-2
    r3 = -.198408328x10^-3
    r4 = .275239711x10^-5
    r5 = -.238683464x10^-7

Finally

    COSD(x) = (-1)^(n+1)*sin(f)




Error Conditions 

If the absolute value of the argument is greater than or equal to 47185919, the 
following message is issued and the result is set to 0.0. 

COSD: ABS(arg) too large; result = zero 
DSIN



Description 

The DSIN routine calculates the double-precision, D-floating-point sine of the 
double-precision, D-floating-point angle given in radians as the argument. 
That is: 

DSIN(x) = sin(x) 

Routines Called 

DSIN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value less than or
equal to 6746518852 (or 2^31*π).

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-1.0 to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      6.06x10^-19 (60.5 bits)
RMS:                      1.35x10^-19 (62.7 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   22%   68%   10%    0%



Algorithm Used 

DSIN(x) is calculated as follows. Note that DSIN(x) = -DSIN(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.
    f = ((|x|-n*c1)-n*c2)-n*c3
    c1 = high-order 34 bits of π
    c2 = next 31 bits of π
    c3 = next 62 bits of π

If |f| < 2^-31
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = (g*XNUM/XDEN+rp1)*g

    XNUM = ((rp5*g+rp4)*g+rp3)*g+rp2
    XDEN = ((g+q2)*g+q1)*g+q0
    rp1 = -.166666666666666667
    rp2 = .451456904704461990x10^5
    rp3 = -.489487151969463797x10^3
    rp4 = .428183075897778265x10^1
    rp5 = -.121560740596710190x10^-1
    q0 = .541748285645351853x10^7
    q1 = .702492288221842518x10^5
    q2 = .394924723520450141x10^3

Finally

    DSIN(x) = sgn(x)*(-1)^n*sin(f)
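
The rational approximation for the reduced argument can be checked with a
short Python sketch. The name dsin_reduced is hypothetical. Two readings of
the printed formulas are assumed here: the denominator is taken as
((g+q2)*g+q1)*g+q0, and the exponents are those for which rp2/q0 = 1/120;
with these readings the expression reproduces the sine series.

```python
import math

RP1 = -0.166666666666666667
RP2 = 0.451456904704461990e5
RP3 = -0.489487151969463797e3
RP4 = 0.428183075897778265e1
RP5 = -0.121560740596710190e-1
Q0 = 0.541748285645351853e7
Q1 = 0.702492288221842518e5
Q2 = 0.394924723520450141e3

def dsin_reduced(f):
    """sin(f) for |f| <= pi/2, computed as f + f*R(g) with g = f*f."""
    if abs(f) < 2.0 ** -31:
        return f
    g = f * f
    xnum = ((RP5 * g + RP4) * g + RP3) * g + RP2
    xden = ((g + Q2) * g + Q1) * g + Q0
    r = (g * xnum / xden + RP1) * g
    return f + f * r
```

In Python double precision (53 bits) the sketch cannot show the full
accuracy quoted above; it only confirms that the formula and coefficients
are mutually consistent.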

Error Conditions 

If the absolute value of the argument is greater than 6746518852, the following
message is issued and the result is set to 0.0.

DSIN: ABS(arg) too large; result = zero 
DCOS



Description 

The DCOS routine calculates the double-precision, D-floating-point cosine of 
the double-precision, D-floating-point angle given in radians as the argument. 
That is: 

DCOS(x) = cos(x) 

Routines Called 

DCOS calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value less than
6746518852 (or 2^31*π).

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-1.0 to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      4.96x10^-19 (60.8 bits)
RMS:                      1.41x10^-19 (62.6 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   16%   66%   18%    0%



Algorithm Used 

DCOS(x) is calculated as follows. Note that DCOS(x) = DCOS(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.
    f = ((|x|-n*c1)-n*c2)-n*c3

    c1 = high-order 34 bits of π
    c2 = next 31 bits of π
    c3 = next 62 bits of π

If |f| < 2^-31
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = (g*XNUM/XDEN+rp1)*g

    XNUM = ((rp5*g+rp4)*g+rp3)*g+rp2
    XDEN = ((g+q2)*g+q1)*g+q0
    rp1 = -.166666666666666667
    rp2 = .451456904704461990x10^5
    rp3 = -.489487151969463797x10^3
    rp4 = .428183075897778265x10^1
    rp5 = -.121560740596710190x10^-1
    q0 = .541748285645351853x10^7
    q1 = .702492288221842518x10^5
    q2 = .394924723520450141x10^3

Finally

    DCOS(x) = (-1)^(n+1)*sin(f)

Error Conditions 

If the absolute value of the argument is greater than or equal to 6746518852, 
the following message is issued and the result is set to 0.0. 

DCOS: ABS(arg) too large; result = zero 
GSIN



Description 

The GSIN routine calculates the double-precision, G-floating-point sine of the 
double-precision, G-floating-point angle given in radians as the argument. 
That is, 

GSIN(x) = sin(x) 

Routines Called 

GSIN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value less than or
equal to 1686629713 (or 2^29*π).

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-1.0 to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      3.30x10^-18 (58.1 bits)
RMS:                      8.85x10^-19 (60.0 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   13%   78%    9%    0%



Algorithm Used 

GSIN(x) is calculated as follows. Note that GSIN(x) = -GSIN(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.
    f = ((|x|-n*c1)-n*c2)-n*c3
    c1 = high-order 30 bits of π
    c2 = next 28 bits of π
    c3 = next 62 bits of π

If |f| < 2^-30
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = (g*XNUM/XDEN+rp1)*g

    XNUM = ((rp5*g+rp4)*g+rp3)*g+rp2
    XDEN = ((g+q2)*g+q1)*g+q0
    rp1 = -.166666666666666667
    rp2 = .451456904704461990x10^5
    rp3 = -.489487151969463797x10^3
    rp4 = .428183075897778265x10^1
    rp5 = -.121560740596710190x10^-1
    q0 = .541748285645351853x10^7
    q1 = .702492288221842518x10^5
    q2 = .394924723520450141x10^3

Finally

    GSIN(x) = sgn(x)*(-1)^n*sin(f)
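
The reason for subtracting π in pieces (c1, c2, c3) can be illustrated in
Python. The sketch below is a simplification under stated assumptions: it
uses only two pieces because a Python float carries 53 bits, the names
C1, C2, reduce_split, and reduction_error are hypothetical, and exact
rational arithmetic from the fractions module serves as the reference.

```python
import math
from fractions import Fraction

# Reference value of pi to 50 decimal places, held as an exact rational.
PI_EXACT = Fraction("3.14159265358979323846264338327950288419716939937510")

# C1 keeps only the top 30 bits of pi (2 integer bits + 28 fraction bits),
# so n*C1 is exact for any n below 2**23. C2 holds the bits that follow.
C1 = math.floor(math.pi * 2.0 ** 28) / 2.0 ** 28
C2 = float(PI_EXACT - Fraction(C1))

def reduce_split(ax):
    """Return (n, f) with ax = pi*n + f, subtracting pi in two pieces."""
    n = round(ax / math.pi)
    f = (ax - n * C1) - n * C2
    return n, f

def reduction_error(ax):
    """Absolute error of f against the exact value ax - n*pi."""
    n, f = reduce_split(ax)
    exact = Fraction(ax) - n * PI_EXACT
    return abs(float(Fraction(f) - exact))
```

Because n*C1 is computed without rounding, the cancellation in ax-n*C1
loses nothing, and the remaining bits of π enter through the much smaller
product n*C2.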

Error Conditions 

If the absolute value of the argument is greater than 1686629713, the following 
message is issued and the result is set to 0.0. 

GSIN: ABS(arg) too large; result = zero 
GCOS



Description 

The GCOS routine calculates the double-precision, G-floating-point cosine of 
the double-precision, G-floating-point angle given in radians as the argument. 
That is: 

GCOS(x) = cos(x) 

Routines Called

GCOS calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value less than
1686629713 (or 2^29*π).

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-1.0 to 1.0. 



Accuracy of Result 

test interval:            -10.000 through 201.06

MRE:                      3.44x10^-18 (58.0 bits)
RMS:                      9.84x10^-19 (59.8 bits)

LSB error distribution:   -2    -1     0    +1    +2
                          0%   14%   72%   15%    0%



Algorithm Used 

GCOS(x) is calculated as follows. Note that GCOS(x) = GCOS(-x).

Let |x| = π*n+f
    |f| < π/2

The argument reduction is as follows.
    f = ((|x|-n*c1)-n*c2)-n*c3
    c1 = high-order 30 bits of π
    c2 = next 28 bits of π
    c3 = next 62 bits of π

If |f| < 2^-30
    sin(f) = f

Otherwise
    sin(f) = f+f*R(g)

    g = f^2

    R(g) = (g*XNUM/XDEN+rp1)*g

    XNUM = ((rp5*g+rp4)*g+rp3)*g+rp2
    XDEN = ((g+q2)*g+q1)*g+q0
    rp1 = -.166666666666666667
    rp2 = .451456904704461990x10^5
    rp3 = -.489487151969463797x10^3
    rp4 = .428183075897778265x10^1
    rp5 = -.121560740596710190x10^-1
    q0 = .541748285645351853x10^7
    q1 = .702492288221842518x10^5
    q2 = .394924723520450141x10^3

Finally

    GCOS(x) = (-1)^(n+1)*sin(f)



Error Conditions 

If the absolute value of the argument is greater than or equal to 1686629713, 
the following message is issued and the result is set to 0.0. 

GCOS: ABS(arg) too large; result = zero 
CSIN



Description 

The CSIN routine calculates the complex, single-precision, floating-point sine 
of the complex, single-precision, floating-point angle given in radians as the 
argument. That is: 

CSIN(z) = sin(z) 

Routines Called 

CSIN calls the SIN, COS, EXP, ALOG, and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value, the
real part of which must be less than 210828714 (or 2^26*π).

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

test interval:            -200.00 through 200.00 real
                          -10.000 through 10.000 imaginary

MRE:                      3.30x10^-8 (24.9 bits) real
                          3.44x10^-8 (24.8 bits) imaginary

RMS:                      7.68x10^-9 (27.0 bits) real
                          6.75x10^-9 (27.1 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          2%   23%   51%   22%    2%   real
                          1%   19%   57%   22%    1%   imaginary



Algorithm Used 

CSIN(z) is calculated as follows.

Let z = x+i*y

If |x| > 210828714

    CSIN(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

    For the real part of the result:
    Let t = |sin(x)|

    If t = 0.0
        x = 0.0

    If loge(t)+|y| > 88.722839

        x = ± machine infinity
    (88.722839 = 88.029692+loge(2))

    For the imaginary part of the result:
    Let t = |cos(x)| ≠ 0

    If loge(t)+|y| > 88.722839
        y = ± infinity

Otherwise

    CSIN(z) = sin(x)*cosh(y)+i*cos(x)*sinh(y)
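
The overflow test on loge(t)+|y| can be sketched in Python. The name
csin_sketch and the helper scaled are hypothetical; the sketch assumes
that for |y| above 88.029692 both cosh(y) and |sinh(y)| are taken as
e^|y|/2, which is why loge(2) appears in the limit, and it returns
math.inf where the routine returns machine infinity.

```python
import math

MAX_REAL = 210828714.0
BIG_Y = 88.029692
OVERFLOW_LIMIT = 88.722839          # BIG_Y + loge(2)

def csin_sketch(z):
    x, y = z.real, z.imag
    if abs(x) > MAX_REAL:
        return complex(0.0, 0.0)    # the library also issues its message
    s, c = math.sin(x), math.cos(x)
    if abs(y) <= BIG_Y:
        return complex(s * math.cosh(y), c * math.sinh(y))

    def scaled(t, sign):
        # |t| * e**|y| / 2, evaluated through logarithms to detect overflow.
        if t == 0.0:
            return 0.0
        e = math.log(abs(t)) + abs(y)
        if e > OVERFLOW_LIMIT:
            return math.copysign(math.inf, sign)
        return math.copysign(math.exp(e - math.log(2.0)), sign)

    # The real part takes the sign of sin(x); the imaginary part takes the
    # sign of cos(x)*sinh(y), that is, of cos(x)*y.
    return complex(scaled(s, s), scaled(c, c * y))
```

Working with loge(t)+|y| lets a small |sin(x)| or |cos(x)| offset a large
|y|, so a part is reported as overflowing only when the product itself is
out of range.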

Error Conditions 

1. If the absolute value of the real part of the argument is greater than 
210828714, the following message is issued and the result is set to (0.0,0.0). 

CSIN: ABS(REAL(arg)) too large; result = zero 

2. If |y|+loge(|sin(x)|) > 88.722839, the real part overflows. If
|y|+loge(|cos(x)|) > 88.722839, the imaginary part overflows. If either part
overflows, one of the following messages is issued and the relevant part of
the result is set to ± machine infinity.

CSIN: Imaginary part overflow 
CSIN: Real part overflow 

3. If the imaginary part of the result is too small a number, the following 
message is issued and the imaginary part of the result is set to 0.0. 

CSIN: Imaginary part underflow 
CCOS



Description 

The CCOS routine calculates the complex, single-precision, floating-point 
cosine of the complex, single-precision, floating-point angle given in radians 
as the argument. That is: 

CCOS(z) = cos(z) 

Routines Called 

CCOS calls the SIN, COS, EXP, ALOG, and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value, the
real part of which must be less than 210828714 (or 2^26*π).

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

test interval:            -200.00 through 200.00 real
                          -10.000 through 10.000 imaginary

MRE:                      3.35x10^-8 (24.8 bits) real
                          3.57x10^-8 (24.7 bits) imaginary

RMS:                      7.76x10^-9 (26.9 bits) real
                          6.68x10^-9 (27.2 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          2%   20%   50%   25%    3%   real
                          1%   20%   57%   20%    1%   imaginary



Algorithm Used 

CCOS(z) is calculated as follows.

Let z = x+i*y

If |x| > 210828714

    CCOS(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

    For the real part of the result:
    Let t = |cos(x)| ≠ 0

    If loge(t)+|y| > 88.722839
        x = ± machine infinity

    (88.722839 = 88.029692+loge(2))

    For the imaginary part of the result:
    Let t = |sin(x)|

    If t = 0.0
        y = 0.0

    If loge(t)+|y| > 88.722839
        y = ± machine infinity

Otherwise

    CCOS(z) = cos(x)*cosh(y)-i*sin(x)*sinh(y)

Error Conditions 

1. If the absolute value of the real part of the argument is greater than 
210828714, the following message is issued and the result is set to (0.0,0.0). 

CCOS: ABS(REAL(arg)) too large: result = zero 

2. If |y|+loge(|cos(x)|) > 88.722839, the real part overflows. If
|y|+loge(|sin(x)|) > 88.722839, the imaginary part overflows. If either part
overflows, one of the following messages is issued and the relevant part of
the result is set to ± machine infinity.

CCOS: Imaginary part overflow 
CCOS: Real part overflow 

3. If the imaginary part of the result is too small a number, the following 
message is issued and the imaginary part of the result is set to 0.0. 

CCOS: Imaginary part underflow 
CDSIN



Description 

The CDSIN subroutine calculates the complex, double-precision, D-floating- 
point sine of the complex, double-precision, D-floating-point angle given in 
radians as the argument. That is: 

CDSIN(z,r) = sin(z) 

z = location of input value 
r = location of result 

Routines Called 

CDSIN calls the DSIN, DCOS, DEXP, DLOG, and MTHERR routines. 

Type of Argument 

CDSIN is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a
complex, double-precision, D-floating-point value, the real part of which
must be less than 2^31*π-π/2.

Type of Result 

The result returned is a complex, double-precision, D-floating-point value; it 
may be any such value. It is returned in the second vector (r) supplied in the 
call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result 

test interval:            -200.00 through 200.00 real
                          -10.000 through 10.000 imaginary

MRE:                      1.09x10^-18 (59.7 bits) real
                          9.86x10^-19 (59.8 bits) imaginary

RMS:                      2.22x10^-19 (62.0 bits) real
                          2.08x10^-19 (62.1 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          2%   22%   51%   23%    2%   real
                          2%   26%   54%   17%    1%   imaginary
Algorithm Used

CDSIN(z) is calculated as follows.

Let z = x+i*y

If |x| > 2^31*π-π/2

    CDSIN(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

    For the real part of the result:
    Let t = |sin(x)|

    If t = 0.0
        x = 0.0

    If loge(t)+|y| > 88.722839
        x = ± infinity
    (88.722839 = 88.029692+loge(2))

    For the imaginary part of the result:
    Let t = |cos(x)| ≠ 0

    If loge(t)+|y| > 88.722839
        y = ± infinity

Otherwise

    CDSIN(z) = sin(x)*cosh(y)+i*cos(x)*sinh(y)

Error Conditions 

1. If the absolute value of the real part of the argument is greater than
2^31*π-π/2, the following message is issued and the result is set to (0.0,0.0).

CDSIN: ABS(REAL(arg)) too large; result = zero 

2. If |y|+loge(|sin(x)|) > 88.722839, the real part overflows. If
|y|+loge(|cos(x)|) > 88.722839, the imaginary part overflows. If either part
overflows, one of the following messages is issued and the relevant part of
the result is set to ± machine infinity.

CDSIN: ABS(IMAG(arg)) too large; REAL(result) = infinity 
CDSIN: ABS(IMAG(arg)) too large; IMAG(result) = infinity 

3. If the imaginary part of the result is too small a number, the following 
message is issued and the imaginary part of the result is set to 0.0. 

CDSIN: Imaginary part underflow 
CDCOS



Description 

The CDCOS subroutine calculates the complex, double-precision, D-floating- 
point cosine of the complex, double-precision, D-floating-point angle given in 
radians as the argument. That is: 

CDCOS(z,r) = cos(z)

z = location of input value 
r = location of result 

Routines Called 

CDCOS calls the DSIN, DCOS, DEXP, DLOG, and MTHERR routines. 

Type of Argument 

CDCOS is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a
complex, double-precision, D-floating-point value, the real part of which
must be less than 2^31*π-π/2.

Type of Result 

The result returned is a complex, double-precision, D-floating-point value; it 
may be any such value. It is returned in the second vector (r) supplied in the 
call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result 

test interval:            -200.00 through 200.00 real
                          -10.000 through 10.000 imaginary

MRE:                      9.89x10^-19 (59.8 bits) real
                          9.98x10^-19 (59.8 bits) imaginary

RMS:                      2.25x10^-19 (61.9 bits) real
                          2.03x10^-19 (62.1 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          3%   24%   50%   21%    2%   real
                          1%   21%   55%   21%    1%   imaginary
Algorithm Used

CDCOS(z) is calculated as follows.

Let z = x+i*y

If |x| > 2^31*π-π/2

    CDCOS(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

    For the real part of the result:
    Let t = |cos(x)| ≠ 0

    If loge(t)+|y| > 88.722839
        x = ± infinity
    (88.722839 = 88.029692+loge(2))

    For the imaginary part of the result:
    Let t = |sin(x)|

    If t = 0.0
        y = 0.0

    If loge(t)+|y| > 88.722839
        y = ± infinity

Otherwise

    CDCOS(z) = cos(x)*cosh(y)-i*sin(x)*sinh(y)

Error Conditions 

1. If the absolute value of the real part of the argument is greater than
2^31*π-π/2, the following message is issued and the result is set to (0.0,0.0).

CDCOS: ABS(REAL(arg)) too large; result = zero 

2. If |y|+loge(|cos(x)|) > 88.722839, the real part overflows. If
|y|+loge(|sin(x)|) > 88.722839, the imaginary part overflows. If either part
overflows, one of the following messages is issued and the relevant part of
the result is set to ± machine infinity.

CDCOS: ABS(IMAG(arg)) too large; REAL(result) = infinity 
CDCOS: ABS(IMAG(arg)) too large; IMAG(result) = infinity 

3. If the imaginary part of the result is too small a number, the following
message is issued and the imaginary part of the result is set to 0.0.

CDCOS: Imaginary part underflow 
CGSIN



Description 

The CGSIN subroutine calculates the complex, double-precision, G-floating- 
point sine of the complex, double-precision, G-floating-point angle given in 
radians as the argument. That is, 

CGSIN(z,r) = sin(z) 

z = location of input value 
r = location of result 

Routines Called 

CGSIN calls the GSIN, GCOS, GEXP, GLOG, and MTHERR routines. 

Type of Argument 

CGSIN is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a
complex, double-precision, G-floating-point value, the real part of which
must be less than 2^29*π-π/2.

Type of Result 

The result returned is a complex, double-precision, G-floating-point value; it 
may be any such value. It is returned in the second vector (r) supplied in the 
call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result 

test interval:            -200.00 through 200.00 real
                          -10.000 through 10.000 imaginary

MRE:                      7.35x10^-18 (56.9 bits) real
                          7.01x10^-18 (57.0 bits) imaginary

RMS:                      1.76x10^-18 (59.0 bits) real
                          1.61x10^-18 (59.1 bits) imaginary

LSB error distribution:   -2    -1     0    +1    +2
                          2%   22%   51%   23%    2%   real
                          1%   20%   55%   22%    2%   imaginary
Algorithm Used

CGSIN(z) is calculated as follows.

Let z = x+i*y

If |x| > 2^29*π-π/2

    CGSIN(z) = (0.0,0.0)

If |y| > 709.089565712824, calculation proceeds as follows.

    For the real part of the result:
    Let t = |sin(x)|

    If t = 0.0
        x = 0.0

    If loge(t)+|y| > 709.782712893384
        x = ± machine infinity
    (709.782712893384 = 709.089565712824+loge(2))

    For the imaginary part of the result:
    Let t = |cos(x)| ≠ 0.0

    If loge(t)+|y| > 709.782712893384
        y = ± machine infinity

Otherwise

    CGSIN(z) = sin(x)*cosh(y)+i*cos(x)*sinh(y)

Error Conditions 

1. If the absolute value of the real part of the argument is greater than
2^29*π-π/2, the following message is issued and the result is set to (0.0,0.0).

CGSIN: ABS(REAL(arg)) too large; result = zero 

2. If ly l+log e (lsin(x)l) > 709.782712893384, the real part of the result will over- 
flow. If lyl+log e (lcos(x)i) > 709.782712893384, the imaginary part of the 
result will overflow. Any overflowed result is set to ±machine infinity and 
one of the following messages is issued. 

CGSIN: ABS(IMAG(arg)) too large; REAL(result) = infinity 

CGSIN: AGS(IMAG(arg)) too large; IMAG(result) = infinity 

3. If the imaginary part of the result underflows, the following message is 
issued and the imaginary part of the result is set to 0.0. 

CGSIN: Imaginary part underflow 
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CGCOS 



Description 

The CGCOS subroutine calculates the complex, double-precision, G-floating- 
point cosine of the complex, double-precision, G-floating-point angle given in 
radians as the argument. That is: 

CGCOS(z,r) = cos(z) 

z = location of input value 
r = location of result 

Routines Called 

CGCOS calls the GSIN, GCOS, GEXP, GLOG, and MTHERR routines. 

Type of Argument 

CGCOS is a subroutine that is called with two arguments. Both arguments 
must be two-element, double-precision vectors. The first vector (z) contains 
the input value; the second vector (r) will contain the result. The real part of 
the input value must be stored in the first element of z; the imaginary part 
must be stored in the second element of z. The input value must be a complex,
double-precision, G-floating-point value, the real part of which must be
less than 2^29*π - π/2 in absolute value.

Type of Result 

The result returned is a complex, double-precision, G-floating-point value; it 
may be any such value. It is returned in the second vector (r) supplied in the 
call. The real part of the result is returned in the first element of r; the 
imaginary part is returned in the second element of r. 



Accuracy of Result

   test interval:   -200.00 through 200.00 real
                    -10.000 through 10.000 imaginary

   MRE:             8.31x10^-18 (56.7 bits) real
                    7.00x10^-18 (57.0 bits) imaginary

   RMS:             1.83x10^-18 (58.9 bits) real
                    1.53x10^-18 (59.2 bits) imaginary

   LSB error distribution:

      -2     -1      0     +1     +2
      2%    20%    50%    25%     3%   real
      2%    20%    58%    20%     1%   imaginary

Algorithm Used

CGCOS(z) is calculated as follows.

Let z = x+i*y

If |x| > 2^29*π - π/2

   CGCOS(z) = (0.0,0.0)

If |y| > 709.089565712824, calculation proceeds as follows.

   For the real part of the result:
      Let t = |cos(x)| (t is not equal to 0.0)

      If log_e(t)+|y| > 709.782712893384
         real part = ± machine infinity
      (709.782712893384 = 709.089565712824+log_e(2))

   For the imaginary part of the result:
      Let t = |sin(x)|

      If t = 0.0
         imaginary part = 0.0

      If log_e(t)+|y| > 709.782712893384
         imaginary part = ± machine infinity

Otherwise

   CGCOS(z) = cos(x)*cosh(y)-i*sin(x)*sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than
   2^29*π - π/2, the following message is issued and the result is set to
   (0.0,0.0).

   CGCOS: ABS(REAL(arg)) too large; result = zero

2. If |y|+log_e(|cos(x)|) > 709.782712893384, the real part of the result will
   overflow. If |y|+log_e(|sin(x)|) > 709.782712893384, the imaginary part of
   the result will overflow. Any overflowed result is set to ±machine infinity
   and one of the following messages is issued.

   CGCOS: ABS(IMAG(arg)) too large; REAL(result) = infinity

   CGCOS: ABS(IMAG(arg)) too large; IMAG(result) = Infinity

3. If the imaginary part of the result underflows, the following message is
   issued and the imaginary part is set to 0.0.

   CGCOS: Imaginary part underflow

TAN



Description 

The TAN routine calculates the single-precision, floating-point tangent of the 
single-precision, floating-point angle given in radians as the argument. That 
is: 

TAN(x) = tan(x) 

Routines Called 

TAN calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than or
equal to 2^26*π/2 in absolute value.

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      2.35x10^-8 (25.3 bits)
   RMS:                      5.28x10^-9 (27.5 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%    13%    70%    16%     0%

Algorithm Used

TAN(x) is calculated as follows.

If |x| > 2^26*π/2
   TAN(x) = 0.0

Otherwise, the identities:

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

are used to reduce TAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:
   x = n*π/4.0+f where 0.0 < f < π/4.0

If f < 2^-14
   tan(f) = f

Otherwise

   tan(f) = f*R(f^2)

   R(f^2) = (p0+f^2*(p1+f^2*p2))/(q0+f^2*(q1+f^2))
   p0 = 62.604
   p1 = -6.9716
   p2 = 6.7309x10^-2
   q0 = p0
   q1 = -27.839

Then, TAN(x) can be derived if L is an integer and n has the values shown
in the following table.

                  Deriving TAN(x)

                  Low-order two
   Value of n     bits of n        TAN(x)

   4L             00               sgn(x)*tan(f)
   4L+1           01               sgn(x)*(1/tan(f))
   4L+2           10               sgn(x)*(-1/tan(f))
   4L+3           11               sgn(x)*-tan(f)
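The reduction and the table can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's code: the names tan_small and tan_sketch are hypothetical; the coefficients are the rounded values printed above, with p2 taken as 6.7309x10^-2 (the scaling at which f*R(f^2) reproduces the series f + f^3/3 + 2f^5/15); and for odd n the reduced argument f is measured back from the next multiple of π/4, which is what the identities in the 01 and 11 rows require.

```python
import math

P0, P1, P2 = 62.604, -6.9716, 6.7309e-2   # rounded values as printed
Q0, Q1 = P0, -27.839

def tan_small(f):
    """tan(f) for 0 <= f <= pi/4 as f*R(f^2)."""
    if f < 2.0**-14:
        return f
    g = f * f
    return f * (P0 + g * (P1 + g * P2)) / (Q0 + g * (Q1 + g))

def tan_sketch(x):
    if abs(x) > 2.0**26 * math.pi / 2:
        return 0.0                       # documented result for an oversized argument
    quarter = math.pi / 4
    q, rem = divmod(abs(x), quarter)     # |x| = n*(pi/4) + rem, 0 <= rem < pi/4
    n = int(q)
    # Assumption: odd octants use the distance to the next multiple of pi/4.
    f = rem if n % 2 == 0 else quarter - rem
    t = tan_small(f)
    low2 = n & 3                         # low-order two bits of n
    if low2 == 0:
        r = t
    elif low2 == 1:
        r = 1.0 / t
    elif low2 == 2:
        r = -1.0 / t if t != 0.0 else -math.inf   # guard the pole at pi/2
    else:
        r = -t
    return math.copysign(1.0, x) * r
```

Because the printed coefficients carry only five digits, the sketch agrees with a full-precision tangent to roughly five significant digits; the library's accuracy figures refer to the full-precision coefficients.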



Reference 

Coefficients are derived from those given in Cody and Waite, Software Manual
for Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for
machines with 25-32 bit precision.

Error Conditions 

If the absolute value of the argument is greater than 2^26*π/2, the following
message is issued and the result is set to 0.0.

   TAN: ABS(arg) too large; result = zero

COTAN



Description 

The COTAN routine calculates the single-precision, floating-point cotangent 
of the single-precision, floating-point angle given in radians as the argument. 
That is: 

COTAN(x) = cot(x) 

Routines Called 

COTAN calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value less than or
equal to 2^26*π/2 and greater than 2^-126*(1/2+2^-27) in absolute value.

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      2.42x10^-8 (25.3 bits)
   RMS:                      5.29x10^-9 (27.5 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%    18%    66%    16%     0%



Algorithm Used

COTAN(x) is calculated as follows.

If |x| > 2^26*π/2

   COTAN(x) = 0.0

If |x| < 2^-126*(1/2+2^-27)

   COTAN(x) = +machine infinity

Otherwise, the identities:

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

   cot(x) = 1.0/tan(x)

   cot(-x) = -cot(x)

are used to reduce COTAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:
   x = n*π/4.0+f where 0.0 < f < π/4.0

If f < 2^-14
   tan(f) = f

Otherwise

   tan(f) = f*R(f^2)

   R(f^2) = (p0+f^2*(p1+f^2*p2))/(q0+f^2*(q1+f^2))
   p0 = 62.604
   p1 = -6.9716
   p2 = 6.7309x10^-2
   q0 = p0
   q1 = -27.839

Then COTAN(x) can be derived if L is an integer and n has the value
shown in the following table.

                  Deriving COTAN(x)

                  Low-order two
   Value of n     bits of n        COTAN(x)

   4L             00               sgn(x)*(1/tan(f))
   4L+1           01               sgn(x)*tan(f)
   4L+2           10               sgn(x)*-tan(f)
   4L+3           11               sgn(x)*-(1/tan(f))



Reference 

Coefficients are derived from those given in Cody and Waite, Software Man- 
ual for Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for 
machines with 25-32 bit precision. 

Error Conditions 

1. If the absolute value of the argument is less than 2^-126*(1/2+2^-27), the
   following message is issued and the result is set to + machine infinity.

   COTAN: result overflow

2. If the absolute value of the argument is greater than 2^26*π/2, the following
   message is issued and the result is set to 0.0.

   COTAN: ABS(arg) too large; result = zero

DTAN



Description 

The DTAN routine calculates the double-precision, D-floating-point tangent 
of the double-precision, D-floating-point angle given in radians as the argu- 
ment. That is: 

DTAN(x) = tan(x) 

Routines Called 

DTAN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value less than or
equal to 2^31*π/2 in absolute value.

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      9.60x10^-19 (59.9 bits)
   RMS:                      2.08x10^-19 (62.1 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             1%    18%    55%    22%     3%



Algorithm Used

DTAN(x) is calculated as follows.

If |x| > 2^31*π/2
   DTAN(x) = 0.0

Otherwise, the identities:

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

are used to reduce DTAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:

   x = n*π/2.0+f where -π/4.0 < f < π/4.0

If f < 2^-31
   tan(f) = f

Otherwise

   tan(f) = R(f)

   R(f) = (((((xp4*g+xp3)*g+xp2)*g+xp1)*g)*f+f)/
          ((((q4*g+q3)*g+q2)*g+q1)*g+1.0)

   g = f*f
   xp1 = -.1372889460941120802
   xp2 = .3925934686364577602x10^-2
   xp3 = -.2882482747560198194x10^-4
   xp4 = .2927308283322907641x10^-7
   q1 = -.4706222794274454135
   q2 = .2746669449551304872x10^-1
   q3 = -.4030063705745304384x10^-3
   q4 = .1312960309685759549x10^-5

If n is even

   DTAN(x) = tan(f)

If n is odd

   DTAN(x) = -1/tan(f)
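A minimal Python sketch of this scheme follows. It is not the library's code: the names tan_reduced and dtan_sketch are hypothetical, IEEE doubles stand in for D-floating values, the coefficient exponents are read as 10^-2, 10^-4, 10^-7 (xp2 to xp4) and 10^-1, 10^-3, 10^-5 (q2 to q4), and the argument reduction is a plain subtraction rather than whatever extended-precision reduction the library performs.

```python
import math

XP = (-.1372889460941120802, .3925934686364577602e-2,
      -.2882482747560198194e-4, .2927308283322907641e-7)   # xp1..xp4
Q = (-.4706222794274454135, .2746669449551304872e-1,
     -.4030063705745304384e-3, .1312960309685759549e-5)    # q1..q4

def tan_reduced(f):
    """tan(f) for |f| <= pi/4 using the rational form R(f)."""
    if abs(f) < 2.0**-31:
        return f
    g = f * f
    num = ((((XP[3] * g + XP[2]) * g + XP[1]) * g + XP[0]) * g) * f + f
    den = (((Q[3] * g + Q[2]) * g + Q[1]) * g + Q[0]) * g + 1.0
    return num / den

def dtan_sketch(x):
    if abs(x) > 2.0**31 * math.pi / 2:
        return 0.0                      # documented result for an oversized argument
    half_pi = math.pi / 2
    n = round(x / half_pi)              # nearest multiple of pi/2
    f = x - n * half_pi                 # |f| <= pi/4 (naive reduction, an assumption)
    t = tan_reduced(f)
    # Even n: tan(f). Odd n: tan(f + pi/2) = -1/tan(f).
    # Not guarded: t == 0.0 with odd n (x exactly a rounded odd multiple of pi/2).
    return t if n % 2 == 0 else -1.0 / t
```

For moderate arguments away from the poles the sketch tracks math.tan closely; for large |x| the naive subtraction loses accuracy, which is why a production routine needs a more careful reduction.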

Reference 

Coefficients are derived from those given in Cody and Waite, Software Man- 
ual for Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) 
for machines with 25-32 bit precision. 

Error Conditions 

If the absolute value of the argument is greater than 2^31*π/2, the following
message is issued and the result is set to 0.0.

   DTAN: ABS(arg) too large; result = zero

DCOTAN



Description 

The DCOTAN routine calculates the double-precision, D-floating-point co- 
tangent of the double-precision, D-floating-point angle given in radians as the 
argument. That is: 

DCOTAN(x) = cot(x) 

Routines Called 

DCOTAN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value less than or
equal to 2^31*π/2 and greater than 2^-127*(1+2^-61) in absolute value.

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      9.09x10^-19 (59.9 bits)
   RMS:                      2.08x10^-19 (62.1 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             2%    23%    55%    19%     1%



Algorithm Used

DCOTAN(x) is calculated as follows.

If |x| > 2^31*π/2

   DCOTAN(x) = 0.0

If |x| < 2^-127*(1+2^-61)

   DCOTAN(x) = +machine infinity

Otherwise, the identities:

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

   cot(x) = 1.0/tan(x)

   cot(-x) = -cot(x)

are used to reduce DCOTAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:

   x = n*π/2.0+f where -π/4.0 < f < π/4.0

If f < 2^-31
   tan(f) = f

Otherwise

   tan(f) = R(f)

   R(f) = (((((xp4*g+xp3)*g+xp2)*g+xp1)*g)*f+f)/
          ((((q4*g+q3)*g+q2)*g+q1)*g+1.0)
   g = f*f
   xp1 = -.1372889460941120802
   xp2 = .3925934686364577602x10^-2
   xp3 = -.2882482747560198194x10^-4
   xp4 = .2927308283322907641x10^-7
   q1 = -.4706222794274454135
   q2 = .2746669449551304872x10^-1
   q3 = -.4030063705745304384x10^-3
   q4 = .1312960309685759549x10^-5

If n is even

   DCOTAN(x) = 1/tan(f)

If n is odd

   DCOTAN(x) = -tan(f)



References 

Coefficients are derived from those given in Cody and Waite, Software Man- 
ual for Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) 
for machines with 25-32 bit precision. 

Error Conditions 

1. If the absolute value of the argument is greater than 2^31*π/2, the following
   message is issued and the result is set to 0.0.

   DCOTAN: ABS(arg) too large; result = zero

2. If the absolute value of the argument is less than 2^-127*(1+2^-61), the
   following message is issued and the result is set to + machine infinity.

   DCOTAN: Result overflow

GTAN



Description 

The GTAN routine calculates the double-precision, G-floating-point tangent 
of the double-precision, G-floating-point angle given in radians as the argu- 
ment. That is: 

GTAN(x) = tan(x) 

Routines Called 

GTAN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value less than or
equal to 2^29*π/2 in absolute value.

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      5.95x10^-18 (57.2 bits)
   RMS:                      1.43x10^-18 (59.3 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             1%    20%    60%    18%     0%



Algorithm Used

GTAN(x) is calculated as follows.

If |x| > 2^29*π/2
   GTAN(x) = 0.0

Otherwise, the identities:

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

are used to reduce GTAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:

   x = n*π/2.0+f where -π/4.0 < f < π/4.0

If f < 2^-30
   tan(f) = f

Otherwise

   tan(f) = R(f)

   R(f) = (((((xp4*g+xp3)*g+xp2)*g+xp1)*g)*f+f)/
          ((((q4*g+q3)*g+q2)*g+q1)*g+1.0)

   g = f*f
   xp1 = -.1372889460941120802
   xp2 = .3925934686364577602x10^-2
   xp3 = -.2882482747560198194x10^-4
   xp4 = .2927308283322907641x10^-7
   q1 = -.4706222794274454135
   q2 = .2746669449551304872x10^-1
   q3 = -.4030063705745304384x10^-3
   q4 = .1312960309685759549x10^-5

If n is even

   GTAN(x) = tan(f)

If n is odd

   GTAN(x) = -1/tan(f)

Reference 

Coefficients are derived from those given in Cody and Waite, Software Manual
for the Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) for
machines with 25-32 bit precision.

Error Conditions 

If the absolute value of the argument is greater than 2^29*π/2, the following
message is issued and the result is set to 0.0.

   GTAN: ABS(arg) too large; result = zero

GCOTAN



Description 

The GCOTAN routine calculates the double-precision, G-floating-point cotangent
of the double-precision, G-floating-point angle given in radians as the
argument. That is:

GCOTAN(x) = cot(x) 

Routines Called 

GCOTAN calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value less than or
equal to 2^29*π/2 and greater than 2^-1023*(1+2^-58) in absolute value.

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 



Accuracy of Result

   test interval:            -10.000 through 201.06
   MRE:                      6.46x10^-18 (57.1 bits)
   RMS:                      1.43x10^-18 (59.3 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             1%    18%    60%    20%     1%



Algorithm Used

GCOTAN(x) is calculated as follows.

If |x| > 2^29*π/2

   GCOTAN(x) = 0.0

If |x| < 2^-1023*(1+2^-58)

   GCOTAN(x) = +machine infinity

Otherwise, the identities

   tan(π/2.0-g) = 1.0/tan(g)

   tan(n*π+h) = tan(h) where -π/2.0 < h < π/2.0

   tan(-x) = -tan(x)

   cot(x) = 1.0/tan(x)

   cot(-x) = -cot(x)

are used to reduce GCOTAN(x) to a problem with

   -π/2.0 < x < π/2.0

Then n and f are defined so that:

   x = n*π/2.0+f where -π/4.0 < f < π/4.0

If f < 2^-30
   tan(f) = f

Otherwise

   tan(f) = R(f)

   R(f) = (((((xp4*g+xp3)*g+xp2)*g+xp1)*g)*f+f)/
          ((((q4*g+q3)*g+q2)*g+q1)*g+1.0)

   g = f*f
   xp1 = -.1372889460941120802
   xp2 = .3925934686364577602x10^-2
   xp3 = -.2882482747560198194x10^-4
   xp4 = .2927308283322907641x10^-7
   q1 = -.4706222794274454135
   q2 = .2746669449551304872x10^-1
   q3 = -.4030063705745304384x10^-3
   q4 = .1312960309685759549x10^-5

If n is even

   GCOTAN(x) = 1/tan(f)

If n is odd

   GCOTAN(x) = -tan(f)



Reference 

Coefficients are derived from those given in Cody and Waite, Software Man- 
ual for Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) 
for machines with 25-32 bit precision. 

Error Conditions 

1. If the absolute value of the argument is greater than 2^29*π/2, the following
   message is issued and the result is set to 0.0.

   GCOTAN: ABS(arg) too large; result = zero

2. If the absolute value of the argument is less than 2^-1023*(1+2^-58), the
   following message is issued and the result is set to + machine infinity.

   GCOTAN: Result overflow

Chapter 6

Inverse Trigonometric Routines 



ASIN 



Description 

The ASIN routine calculates, in radians, the single-precision, floating-point 
arc sine of its single-precision, floating-point argument. That is: 

ASIN(x) = sin^-1(x)

Routines Called 

ASIN calls the SQRT and MTHERR routines. 

Type of Argument 

The argument must be a single-precision, floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a single-precision, floating-point value in the range
-π/2 to π/2.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      2.56x10^-8 (25.2 bits)
   RMS:                      5.34x10^-9 (27.5 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%    10%    83%     7%     0%



Algorithm Used

ASIN(x) is calculated as follows.

Let R(z) = z*(p0+z*(p1+z*p2))/(q0+z*(q1+z))
   p0 = .564915737
   p1 = -.409490163
   p2 = 1.93496723x10^-2
   q0 = 3.38949412
   q1 = -3.98220081

Let s = y+y*R(z)

Then, the following table gives the value of ASIN(x) depending on the
values of x, z, and y.

   range of x      z          y             ASIN(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    -(π/2+s)
   -.5 to 0.0      x^2        -x            -s
   0.0 to .5       x^2        x             s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    π/2+s
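The table translates almost line for line into code. The Python sketch below is an illustration, not the library's code: the names r_of and asin_sketch are hypothetical, IEEE doubles stand in for single-precision values, and p2 is read as 1.93496723x10^-2.

```python
import math

P0, P1, P2 = .564915737, -.409490163, 1.93496723e-2
Q0, Q1 = 3.38949412, -3.98220081

def r_of(z):
    """R(z) = z*(p0+z*(p1+z*p2))/(q0+z*(q1+z))."""
    return z * (P0 + z * (P1 + z * P2)) / (Q0 + z * (Q1 + z))

def asin_sketch(x):
    if abs(x) > 1.0:
        return math.inf                  # documented error result
    ax = abs(x)
    if ax <= 0.5:
        z, y = ax * ax, ax               # middle two rows: z = x^2, y = |x|
        result = y + y * r_of(z)         # s
    else:
        z = (1.0 - ax) / 2.0             # outer two rows: z = (1-|x|)/2
        y = -2.0 * math.sqrt(z)
        result = math.pi / 2 + (y + y * r_of(z))   # pi/2 + s
    # Negative arguments take the negated value (rows 1 and 2 of the table).
    return result if x >= 0.0 else -result
```

Folding the argument with z = (1-|x|)/2 keeps z in [0, 0.25] for every row, so a single low-degree rational function covers the whole domain.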



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following mes- 
sage is issued and the result is set to + machine infinity. 

ASIN: ABS(arg) greater than 1.0; result = +infinity

ACOS



Description 

The ACOS routine calculates, in radians, the single-precision, floating-point 
arc cosine of its single-precision, floating-point argument. That is: 

ACOS(x) = cos^-1(x)

Routines Called 

ACOS calls the SQRT and MTHERR routines. 

Type of Argument 

The argument must be a single-precision, floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a single-precision, floating-point value in the range
0.0 to π.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      1.55x10^-8 (25.9 bits)
   RMS:                      3.76x10^-9 (28.0 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%     8%    83%     9%     0%



Algorithm Used

ACOS(x) is calculated as follows.

Let R(z) = z*(p0+z*(p1+z*p2))/(q0+z*(q1+z))
   p0 = .564915737
   p1 = -.409490163
   p2 = 1.93496723x10^-2
   q0 = 3.38949412
   q1 = -3.98220081

Let s = y+y*R(z)

Then, the following table gives the values of ACOS(x) depending on the
values of x, z, and y.

   range of x      z          y             ACOS(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    π+s
   -.5 to 0.0      x^2        -x            π/2+s
   0.0 to .5       x^2        x             π/2-s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    -s



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following mes- 
sage is issued and the result is set to + machine infinity. 

ACOS: ABS(arg) greater than 1.0; result = +infinity

DASIN



Description 

The DASIN routine calculates, in radians, the double-precision, D-floating- 
point arc sine of its double-precision, D-floating-point argument. That is: 

DASIN(x) = sin^-1(x)

Routines Called 

DASIN calls the DSQRT and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, D-floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range
-π/2 to π/2.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      8.96x10^-19 (60.0 bits)
   RMS:                      1.88x10^-19 (62.2 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             1%    25%    69%     5%     0%



Algorithm Used

DASIN(x) is calculated as follows.

Let R(g) = (g*(rp1+g*(rp2+g*(rp3+g*(rp4+g*rp5)))))/
           (q0+g*(q1+g*(q2+g*(q3+g*(q4+g)))))
   rp1 = -.27368494524164255994x10^2
   rp2 = .57208227877891731407x10^2
   rp3 = -.39688862997504877339x10^2
   rp4 = .10152522233806463645x10^2
   rp5 = -.69674573447350646411
   q0 = -.16421096714498560795x10^3
   q1 = .41714430248260412556x10^3
   q2 = -.38186303361750149284x10^3
   q3 = .15095270841030604719x10^3
   q4 = -.23823859153670238830x10^2

Let s = y+y*R(g)

Then, the following table gives the values of DASIN(x) depending on the
values of x, z, and y (z is the argument g of R).

   range of x      z          y             DASIN(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    -(π/2+s)
   -.5 to 0.0      x^2        -x            -s
   0.0 to .5       x^2        x             s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    π/2+s



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following mes- 
sage is issued and the result is set to + machine infinity. 

DASIN: ABS(arg) greater than 1.0; result = +infinity

DACOS



Description 

The DACOS routine calculates, in radians, the double-precision, D-floating-point
arc cosine of its double-precision, D-floating-point argument. That is:

DACOS(x) = cos^-1(x)

Routines Called 

DACOS calls the DSQRT and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, D-floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range
0.0 to π.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      4.48x10^-19 (61.0 bits)
   RMS:                      1.25x10^-19 (62.8 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%    19%    75%     6%     0%



Algorithm Used

DACOS(x) is calculated as follows.

Let R(g) = (g*(rp1+g*(rp2+g*(rp3+g*(rp4+g*rp5)))))/
           (q0+g*(q1+g*(q2+g*(q3+g*(q4+g)))))
   rp1 = -.27368494524164255994x10^2
   rp2 = .57208227877891731407x10^2
   rp3 = -.39688862997504877339x10^2
   rp4 = .10152522233806463645x10^2
   rp5 = -.69674573447350646411
   q0 = -.16421096714498560795x10^3
   q1 = .41714430248260412556x10^3
   q2 = -.38186303361750149284x10^3
   q3 = .15095270841030604719x10^3
   q4 = -.23823859153670238830x10^2

Let s = y+y*R(g)

Then, the following table gives the values of DACOS(x) depending on the
values of x, z, and y (z is the argument g of R).

   range of x      z          y             DACOS(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    π+s
   -.5 to 0.0      x^2        -x            π/2+s
   0.0 to .5       x^2        x             π/2-s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    -s
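The double-precision variant can be sketched the same way. This Python illustration is not the library's code: the names r_of and dacos_sketch are hypothetical, IEEE doubles (53-bit fraction) stand in for D-floating values, and the coefficient exponents are read as 10^2 and 10^3, under which rp1/q0 equals 1/6, the leading series coefficient of the arc sine.

```python
import math

RP = (-.27368494524164255994e2, .57208227877891731407e2,
      -.39688862997504877339e2, .10152522233806463645e2,
      -.69674573447350646411)                        # rp1..rp5
Q = (-.16421096714498560795e3, .41714430248260412556e3,
     -.38186303361750149284e3, .15095270841030604719e3,
     -.23823859153670238830e2)                       # q0..q4

def r_of(g):
    num = g * (RP[0] + g * (RP[1] + g * (RP[2] + g * (RP[3] + g * RP[4]))))
    den = Q[0] + g * (Q[1] + g * (Q[2] + g * (Q[3] + g * (Q[4] + g))))
    return num / den

def dacos_sketch(x):
    if abs(x) > 1.0:
        return math.inf                  # documented error result
    ax = abs(x)
    if ax <= 0.5:
        y = ax                           # middle rows: z = x^2, y = |x|
        s = y + y * r_of(ax * ax)
        return math.pi / 2 - s if x >= 0.0 else math.pi / 2 + s
    g = (1.0 - ax) / 2.0                 # outer rows: z = (1-|x|)/2
    y = -2.0 * math.sqrt(g)
    s = y + y * r_of(g)
    return -s if x > 0.0 else math.pi + s
```

Only the final combination of s with 0, π/2 or π differs between the arc sine and arc cosine tables; the rational function and the choice of z and y are shared.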



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following mes- 
sage is issued and the result is set to + machine infinity. 

DACOS: ABS(arg) greater than 1.0; result = +infinity

GASIN



Description 

The GASIN routine calculates, in radians, the double-precision, G-floating- 
point arc sine of its double-precision, G-floating-point argument. That is: 

GASIN(x) = sin^-1(x)

Routines Called 

GASIN calls the GSQRT and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, G-floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range
-π/2 to π/2.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      6.69x10^-18 (57.1 bits)
   RMS:                      1.54x10^-18 (59.2 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             1%    26%    72%     2%     0%



Algorithm Used

GASIN(x) is calculated as follows.

Let R(g) = (g*(rp1+g*(rp2+g*(rp3+g*(rp4+g*rp5)))))/
           (q0+g*(q1+g*(q2+g*(q3+g*(q4+g)))))
   rp1 = -.27368494524164255994x10^2
   rp2 = .57208227877891731407x10^2
   rp3 = -.39688862997504877339x10^2
   rp4 = .10152522233806463645x10^2
   rp5 = -.69674573447350646411
   q0 = -.16421096714498560795x10^3
   q1 = .41714430248260412556x10^3
   q2 = -.38186303361750149284x10^3
   q3 = .15095270841030604719x10^3
   q4 = -.23823859153670238830x10^2

Let s = y+y*R(g)

Then, the following table gives the value of GASIN(x) depending on the
values of x, z, and y (z is the argument g of R).

   range of x      z          y             GASIN(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    -(π/2+s)
   -.5 to 0.0      x^2        -x            -s
   0.0 to .5       x^2        x             s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    π/2+s



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following mes- 
sage is issued and the result is set to + machine infinity. 

GASIN: ABS(arg) greater than 1.0; result = +infinity

GACOS



Description 

The GACOS routine calculates, in radians, the double-precision, G-floating- 
point arc cosine of its double-precision, G-floating-point argument. That is: 

GACOS(x) = cos^-1(x)

Routines Called 

GACOS calls the GSQRT and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, G-floating-point value in the range 
-1.0 to 1.0. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range
0.0 to π.



Accuracy of Result

   test interval:            0.00000 through 1.0000
   MRE:                      4.18x10^-18 (57.7 bits)
   RMS:                      1.03x10^-18 (59.8 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%    14%    72%    15%     0%



Algorithm Used

GACOS(x) is calculated as follows.

Let R(g) = (g*(rp1+g*(rp2+g*(rp3+g*(rp4+g*rp5)))))/
           (q0+g*(q1+g*(q2+g*(q3+g*(q4+g)))))
   rp1 = -.27368494524164255994x10^2
   rp2 = .57208227877891731407x10^2
   rp3 = -.39688862997504877339x10^2
   rp4 = .10152522233806463645x10^2
   rp5 = -.69674573447350646411
   q0 = -.16421096714498560795x10^3
   q1 = .41714430248260412556x10^3
   q2 = -.38186303361750149284x10^3
   q3 = .15095270841030604719x10^3
   q4 = -.23823859153670238830x10^2

Let s = y+y*R(g)

Then the following table gives the value of GACOS(x) depending on the
values of x, z, and y (z is the argument g of R).

   range of x      z          y             GACOS(x)

   -1.0 to -.5     (1+x)/2    -2*sqrt(z)    π+s
   -.5 to 0.0      x^2        -x            π/2+s
   0.0 to .5       x^2        x             π/2-s
   .5 to 1.0       (1-x)/2    -2*sqrt(z)    -s



Error Conditions 

If the absolute value of the argument is greater than 1.0, the following
message is issued and the result is set to + machine infinity.

GACOS: ABS(arg) greater than 1.0; result = +infinity

ATAN



Description 

The ATAN routine calculates, in radians, the single-precision, floating-point 
arc tangent of its single-precision, floating-point argument. That is: 

ATAN(x) = tan^-1(x)

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a single-precision, floating-point value in the range
-π/2 to π/2.



Accuracy of Result

   test interval:            -80.000 through 80.000
   MRE:                      8.07x10^-9 (26.9 bits)
   RMS:                      2.99x10^-9 (28.3 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%     1%    98%     1%     0%



Algorithm Used

ATAN(x) is calculated as follows.

If x < 0.0

   ATAN(x) = -ATAN(|x|)

If x > 0.0

   ATAN(x) = tan^-1(XHI)+tan^-1(z)
   z = (x-XHI)/(1+x*XHI)
   XHI is chosen so that
   |z| < tan(π/32)

   tan^-1(XHI) is found by table lookup. It is stored as ATANHI and
   ATANLO to provide guard bits for improved accuracy.
   tan^-1(z) is evaluated by means of a polynomial approximation (see
   "Reference" below).

   If x < tan(π/32)
      z = x
      ATAN(x) = tan^-1(z)

   If x > 1/tan(π/32)
      z = 1/x
      ATAN(x) = π/2-tan^-1(z)

   If tan(π/32) < x < 1/tan(π/32)

      an appropriate XHI is obtained from a table. The table contains
      values for XHI for various ranges of x.

Reference

The polynomial approximation used in the algorithm is formula #4901 from
Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and
Sons, 1968).
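The table-plus-small-argument scheme described under "Algorithm Used" can be sketched in Python. Everything below is an assumption made for illustration: the breakpoints XHI = tan(k*π/16) for k = 1 to 7 are a hypothetical table (the manual does not list the real one), a truncated Taylor series stands in for Hart's formula, the single value ATANHI replaces the ATANHI/ATANLO pair, and the name atan_sketch is invented.

```python
import math

STEP = math.pi / 16                                     # hypothetical table spacing
XHI_TABLE = [math.tan(k * STEP) for k in range(1, 8)]   # hypothetical XHI values
ATANHI_TABLE = [k * STEP for k in range(1, 8)]          # tan^-1(XHI) for each entry
SMALL = math.tan(math.pi / 32)

def atan_small(z):
    """Truncated Taylor series for |z| <= tan(pi/32); a stand-in polynomial."""
    z2 = z * z
    total, term = 0.0, z
    for k in range(10):
        total += term / (2 * k + 1)
        term *= -z2
    return total

def reduced(x, xhi):
    """z = (x-XHI)/(1+x*XHI), so that tan^-1(x) = tan^-1(XHI) + tan^-1(z)."""
    return (x - xhi) / (1.0 + x * xhi)

def atan_sketch(x):
    if x < 0.0:
        return -atan_sketch(-x)
    if x <= SMALL:
        return atan_small(x)
    if x >= 1.0 / SMALL:
        return math.pi / 2 - atan_small(1.0 / x)
    # Pick the table entry that gives the smallest reduced argument.
    xhi, atanhi = min(zip(XHI_TABLE, ATANHI_TABLE),
                      key=lambda entry: abs(reduced(x, entry[0])))
    return atanhi + atan_small(reduced(x, xhi))
```

With breakpoints spaced π/16 apart in angle, the nearest one is never more than π/32 away, which is what keeps |z| within tan(π/32) and lets a short polynomial suffice.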



Error Conditions 

None

ATAN2



Description 

The ATAN2 routine calculates, in radians, the single-precision, floating-point 
polar angle for the two single-precision, floating-point coordinates of a point 
in the x-y plane that are included as the arguments. That is: 

ATAN2(y,x) = tan^-1(y/x)

Routines Called 

ATAN2 calls the ATAN and MTHERR routines. 

Type of Arguments 

The arguments must be single-precision, floating-point values; they can be 
any such values provided both arguments are not zero. 

Type of Result 

The result returned is a single-precision, floating-point value in the range
-π to π.



Accuracy of Result

   test interval:            -80.000 through 1.0000 for x
                             -80.000 through 1.0000 for y
   MRE:                      1.46x10^-8 (26.0 bits)
   RMS:                      3.08x10^-9 (28.3 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%     1%    98%     1%     0%

Algorithm Used

ATAN2(y,x) is calculated as follows.

Let u = |y| and
    v = |x| and compute tan^-1(u/v)

Then find ATAN2(y,x) based on the signs of y and x as follows.

   x    y    ATAN2(y,x)

   +    +    tan^-1(u/v)
   +    -    -tan^-1(u/v)
   -    +    -(tan^-1(u/v)-π)
   -    -    tan^-1(u/v)-π

The reduced argument for ATAN2 is:

   z = (u/v-XHI)/(1+(u/v)*XHI)

This is rewritten as:

   z = (u-v*XHI)/(v+u*XHI)

The numerator is calculated to be:

   u-v*XHI = u-VHI*XHI-VLO*XHI
   v = VHI+VLO

   VHI has, at most, 27 significant bits

   VLO has, at most, 35 significant bits

   XHI is tabulated with, at most, 13 significant bits

This guarantees that the numerator of z is calculated exactly.
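The sign handling can be sketched in Python as follows. This is an illustration, not the library's code: the name atan2_sketch is hypothetical, math.atan stands in for the library's own arc tangent of u/v, the sign table is read as the usual four-quadrant rule, and treating v = 0 as an angle of π/2 is an assumption about a case the text does not spell out.

```python
import math

def atan2_sketch(y, x):
    if x == 0.0 and y == 0.0:
        return 0.0                       # documented result when both arguments are zero
    u, v = abs(y), abs(x)
    # First-quadrant angle of the point (v, u).
    base = math.pi / 2 if v == 0.0 else math.atan(u / v)
    if x >= 0.0:
        return base if y >= 0.0 else -base          # rows (+,+) and (+,-)
    # Rows (-,+) and (-,-): reflect the angle about the y axis.
    return -(base - math.pi) if y >= 0.0 else base - math.pi
```

Reducing to the first quadrant first means the arc tangent itself only ever sees a non-negative ratio; the quadrant is restored afterwards from the two signs.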

Error Conditions 

1. If both arguments are 0.0, the following message is issued and the result is 
set to 0.0. 

ATAN2: Both arguments are zero, result = zero 

2. If y/x underflows and x is greater than 0.0, the following message is issued 
and the result is set to 0.0. 

ATAN2: Result underflow

DATAN



Description 

The DATAN routine calculates, in radians, the double-precision, D-floating-point
arc tangent of its double-precision, D-floating-point argument. That is:

DATAN(x) = tan^-1(x)

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range
-π/2 to π/2.



Accuracy of Result

   test interval:            -80.000 through 80.000
   MRE:                      3.40x10^-19 (61.3 bits)
   RMS:                      9.37x10^-20 (63.2 bits)

   LSB error distribution:   -2     -1      0     +1     +2
                             0%     1%    94%     5%     0%



Algorithm Used

DATAN(x) is calculated as follows.

If x < 0.0

   DATAN(x) = -DATAN(|x|)

If x > 0.0

   DATAN(x) = tan^-1(XHI)+tan^-1(z)
   z = (x-XHI)/(1+x*XHI)
   XHI is chosen so that
   |z| < tan(π/32)

   tan^-1(XHI) is found by table lookup. It is stored as ATANHI and
   ATANLO to provide guard bits for improved accuracy.
   tan^-1(z) is evaluated by means of a polynomial approximation
   (see "Reference" below).

   If x < tan(π/32)
      z = x
      DATAN(x) = tan^-1(z)

   If x > 1/tan(π/32)
      z = 1/x
      DATAN(x) = π/2-tan^-1(z)

   If tan(π/32) < x < 1/tan(π/32)

      an appropriate XHI is obtained from a table. The table contains
      values for XHI for various ranges of x.

Reference

The polynomial approximation used in the algorithm is formula #4904 from
Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and
Sons, 1968).

Error Conditions 

None 
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DATAN2 



Description 

The DATAN2 routine calculates, in radians, the double-precision, D-floating- 
point polar angle for the two double-precision, D-floating-point coordinates of 
a point in the x-y plane that are included as the arguments. That is: 

DATAN2(y,x) = tan^-1(y/x)

Routines Called 

DATAN2 calls the DATAN and MTHERR routines. 

Type of Arguments 

The arguments must be double-precision, D-floating-point values; they can 
be any such values provided both arguments are not zero. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range
-π to π.



Accuracy of Result 

test interval:            -80.000 through 1.0000 for x
                          -80.000 through 1.0000 for y
MRE:                      5.27x10^-19 (60.7 bits)
RMS:                      9.09x10^-20 (63.3 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    1%   97%    2%    0%



Algorithm Used 

DATAN2(y,x) is calculated as follows. 

Let u = |y| and
    v = |x| and compute tan^-1(u/v)

Then find DATAN2(y,x) based on the signs of y and x as follows.

    x    y    DATAN2(y,x)

    +    +    tan^-1(u/v)
    +    -    -tan^-1(u/v)
    -    +    -(tan^-1(u/v) - π)
    -    -    tan^-1(u/v) - π
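
The sign table can be sketched in Python as shown below. This is a minimal illustration, not the library's code: math.atan stands in for DATAN, the both-zero case returns 0.0 as described under "Error Conditions" (without issuing the message), and treating v = 0 as an angle of π/2 is an assumption that the manual does not spell out.

```python
import math


def datan2_sketch(y, x):
    """Polar angle of (x, y) built from tan^-1(u/v) and the sign table."""
    u, v = abs(y), abs(x)
    if u == 0.0 and v == 0.0:
        return 0.0  # documented result when both arguments are zero
    # Assumption: a zero v is handled as the limiting angle pi/2.
    angle = math.pi / 2 if v == 0.0 else math.atan(u / v)
    if x >= 0.0:
        return angle if y >= 0.0 else -angle
    return -(angle - math.pi) if y >= 0.0 else angle - math.pi
```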
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The reduced argument for DATAN2 is: 

    z = (u/v - XHI)/(1 + (u/v)*XHI)

This is rewritten as:

    z = (u - v*XHI)/(v + u*XHI)

The numerator is calculated to be:

    u - v*XHI = u - VHI*XHI - VLO*XHI
    v = VHI + VLO

VHI has, at most, 27 significant bits 

VLO has, at most, 35 significant bits 

XHI is tabulated with, at most, 13 significant bits 

This guarantees that the numerator of z is calculated exactly. 

Error Conditions 

1. If both arguments are 0.0, the following message is issued and the result is 
set to 0.0. 

DATAN2: Both arguments are zero, result = zero 

2. If y/x underflows and x is greater than 0.0, the following message is issued 
and the result is set to 0.0. 

DATAN2: Result underflow 
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GATAN 



Description 

The GATAN routine calculates, in radians, the double-precision, G-floating- 
point arc tangent of its double-precision, G-floating-point argument. That is: 

GATAN(x) = tan^-1(x)

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range
-π/2 to π/2.



Accuracy of Result 

test interval:            -80.000 through 80.000
MRE:                      2.04x10^-18 (58.8 bits)
RMS:                      7.03x10^-19 (60.3 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    1%   97%    2%    0%



Algorithm Used 

GATAN(x) is calculated as follows.

If x < 0.0

    GATAN(x) = -GATAN(|x|)

If x > 0.0

    GATAN(x) = tan^-1(XHI) + tan^-1(z)
    z = (x - XHI)/(1 + x*XHI)
    XHI is chosen so that
    |z| < tan(π/32)

    tan^-1(XHI) is found by table lookup. It is stored as ATANHI and
    ATANLO to provide guard bits for improved accuracy.
    tan^-1(z) is evaluated by means of a polynomial approximation (see
    "Reference" below).

    If x < tan(π/32)
        z = x
        GATAN(x) = tan^-1(z)

    If x > 1/tan(π/32)
        z = 1/x
        GATAN(x) = π/2 - tan^-1(z)

    If tan(π/32) < x < 1/tan(π/32)

        an appropriate XHI is obtained from a table. The table contains
        values for XHI for various ranges of x.
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Reference 

The polynomial approximation used in the algorithm is formula 4904 from 
Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and 
Sons, 1968). 

Error Conditions 

None 
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GATAN2 



Description 

The GATAN2 routine calculates, in radians, the double-precision, G-floating- 
point polar angle for the two double-precision, G-floating-point coordinates of 
a point in the x-y plane that are included as the arguments. That is: 

GATAN2(y,x) = tan^-1(y/x)

Routines Called 

GATAN2 calls the GATAN and MTHERR routines. 

Type of Arguments 

The arguments must be double-precision, G-floating-point values; they can 
be any such values provided both arguments are not zero. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range
-π to π.

Accuracy of Result 

test interval:            -80.000 through 1.0000 for x
                          -80.000 through 1.0000 for y
MRE:                      3.28x10^-18 (58.1 bits)
RMS:                      7.15x10^-19 (60.3 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    1%   98%    2%    0%

Algorithm Used 

GATAN2(y,x) is calculated as follows. 

Let u = |y| and
    v = |x| and compute tan^-1(u/v)

Then find GATAN2(y,x) based on the signs of y and x as follows.

    x    y    GATAN2(y,x)

    +    +    tan^-1(u/v)
    +    -    -tan^-1(u/v)
    -    +    -(tan^-1(u/v) - π)
    -    -    tan^-1(u/v) - π






The reduced argument for GATAN2 is: 

    z = (u/v - XHI)/(1 + (u/v)*XHI)

This is rewritten as:

    z = (u - v*XHI)/(v + u*XHI)

The numerator is calculated to be:

    u - v*XHI = u - VHI*XHI - VLO*XHI
    v = VHI + VLO

VHI has, at most, 27 significant bits 

VLO has, at most, 35 significant bits 

XHI is tabulated with, at most, 13 significant bits 

This guarantees that the numerator of z is calculated exactly. 

Error Conditions 

1. If both arguments are 0.0, the following message is issued and the result is 
set to 0.0. 

GATAN2: Both arguments are zero, result = zero 

2. If y/x underflows and x is greater than 0.0, the following message is issued 
and the result is set to 0.0. 

GATAN2: Result underflow 






Chapter 7 
Hyperbolic Routines 



SINH 



Description 

The SINH routine calculates the single-precision, floating-point hyperbolic 
sine of its single-precision, floating-point argument. That is: 

SINH(x) = sinh(x) 

Routines Called 

SINH calls the EXP and MTHERR routines. 

Type of Argument 

The argument must be a single-precision, floating-point value in the range 
-88.722 to 88.722. 

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. 

Accuracy of Result 

test interval:            0.00000 through 88.721
MRE:                      2.61x10^-8 (25.2 bits)
RMS:                      4.24x10^-9 (27.8 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    4%   85%   11%    0%

Algorithm Used

SINH(x) is calculated as follows.

The table below gives the value of SINH(x) depending upon the range of
values for |x|.

    range of |x|                        SINH(x)

    0.0 to 2^-13                        x
    2^-13 to 1.0                        x*p4(x^2)
    1.0 to 9.7 = 14*log_e(2)            (e^|x| - e^-|x|)/2*sgn(x)
    9.7 to 88.03 = 127*log_e(2)         e^|x|/2*sgn(x)
    88.03 to 88.722 = 128*log_e(2)      e^(|x| - log_e(2))*sgn(x)
    88.722 to infinity                  infinity*sgn(x)



If z = x^2

    p4(z) = 1 + z*(c1 + z*(c2 + z*(c3 + c4*z)))
    c1 = 1.666666643x10^-1
    c2 = 8.333352593x10^-3
    c3 = 1.983581245x10^-4
    c4 = 2.818523951x10^-6
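
A minimal Python sketch of the range table and the p4 polynomial follows. It is illustrative only: math.exp stands in for the EXP routine, Python floats (IEEE double precision) stand in for the single-precision format, and returning a signed math.inf is an assumed stand-in for "machine infinity" and the MTHERR message.

```python
import math

LN2 = math.log(2.0)
C1, C2, C3, C4 = 1.666666643e-1, 8.333352593e-3, 1.983581245e-4, 2.818523951e-6


def p4(z):
    """Polynomial p4(z) from the text, evaluated by Horner's rule."""
    return 1.0 + z * (C1 + z * (C2 + z * (C3 + C4 * z)))


def sinh_sketch(x):
    """Hyperbolic sine selected by the range of |x|."""
    ax = abs(x)
    sgn = math.copysign(1.0, x)
    if ax <= 2.0 ** -13:
        return x
    if ax <= 1.0:
        return x * p4(x * x)
    if ax <= 14 * LN2:
        return (math.exp(ax) - math.exp(-ax)) / 2 * sgn
    if ax <= 127 * LN2:
        # e^-|x| is negligible at single precision in this range.
        return math.exp(ax) / 2 * sgn
    if ax <= 128 * LN2:
        # Subtract log_e(2) first so the exponential itself cannot overflow.
        return math.exp(ax - LN2) * sgn
    return math.copysign(math.inf, x)  # assumed stand-in for machine infinity
```

The last finite branch shows why the table distinguishes 127*log_e(2) from 128*log_e(2): halving is folded into the exponent so that e^|x| is never formed when it would exceed the representable range.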

Error Conditions 

If the absolute value of the argument is greater than 88.722, the following 
message is issued and the result is set to ± machine infinity using the sign of 
the argument. 

SINH: Result overflow 
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COSH 



Description 

The COSH routine calculates the single-precision, floating-point hyperbolic 
cosine of its single-precision, floating-point argument. That is: 

COSH(x) = cosh(x) 

Routines Called 

COSH calls the EXP and MTHERR routines. 

Type of Argument 

The argument must be a single-precision, floating-point value in the range 
-88.722 to 88.722. 

Type of Result 

The result returned is a single-precision, floating-point value greater than or 
equal to 1.0. 



Accuracy of Result 

test interval:            0.00000 through 88.721
MRE:                      2.12x10^-8 (25.5 bits)
RMS:                      4.49x10^-9 (27.7 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    4%   82%   14%    0%



Algorithm Used 

COSH(x) is calculated as follows. 

The table below gives the value of COSH(x) depending upon the range of 
values for Ixl. 



    range of |x|                        COSH(x)

    0.0 to 2^-14                        1.0
    2^-14 to 9.7 = 14*log_e(2)          (e^x + e^-x)/2
    9.7 to 88.03 = 127*log_e(2)         e^|x|/2
    88.03 to 88.722 = 128*log_e(2)      e^(|x| - log_e(2))
    88.722 to infinity                  infinity



Error Conditions 

If the absolute value of the argument is greater than 88.722, the following 
message is issued and the result is set to ± machine infinity using the sign of 
the argument. 

COSH: Result overflow 
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DSINH 



Description 

The DSINH routine calculates the double-precision, D-floating-point hyper- 
bolic sine of its double-precision, D-floating-point argument. That is: 

DSINH(x) = sinh(x) 

Routines Called 

DSINH calls the DEXP and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, D-floating-point value in the range 
-88.722 to 88.722. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 



Accuracy of Result 

test interval:            0.00000 through 88.721
MRE:                      6.82x10^-19 (60.3 bits)
RMS:                      1.27x10^-19 (62.8 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    6%   83%   11%    0%



Algorithm Used 

DSINH(x) is calculated as follows. 

The table below gives the value of DSINH(x) depending upon the range of 
values for Ixl. 



    range of |x|                        DSINH(x)

    0.0 to 2^-31                        x
    2^-31 to 1.0                        x + x*R(x^2)
    1.0 to 22.0 = 32*log_e(2)           (e^|x| - e^-|x|)/2*sgn(x)
    22.0 to 88.03 = 127*log_e(2)        e^|x|/2*sgn(x)
    88.03 to 88.722 = 128*log_e(2)      e^(|x| - log_e(2))*sgn(x)
    88.722 to infinity                  infinity*sgn(x)
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If Z = X 2 

    R(z) = (rp0 + z*(rp1 + z*(rp2 + z*rp3)))/(q0 + z*(q1 + z*(q2 + z)))
    rp0 = .35181283430177117881x10^6
    rp1 = .11563521196851768270x10^5
    rp2 = .16375798202630751372x10^3
    rp3 = .78966127417357099479
    q0 = -.21108770058106271242x10^7
    q1 = .36162723109421836460x10^5
    q2 = -.27773523119650701667x10^3

Error Conditions 

If the absolute value of the argument is greater than 88.722, the following 
message is issued and the result is set to ± machine infinity using the sign of 
the argument. 

DSINH: Result overflow 
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DCOSH 



Description 

The DCOSH routine calculates the double-precision, D-floating-point hyper- 
bolic cosine of its double-precision, D-floating-point argument. That is: 

DCOSH(x) = cosh(x) 

Routines Called 

DCOSH calls the DEXP and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, D-floating-point value in the range 
-88.722 to 88.722. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
or equal to 1.0. 



Accuracy of Result

test interval:            0.00000 through 88.721
MRE:                      5.90x10^-19 (60.6 bits)
RMS:                      1.34x10^-19 (62.7 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    5%   81%   14%    0%



Algorithm Used 

DCOSH(x) is calculated as follows. 

The table below gives the value of DCOSH(x) depending upon the range 
of values for Ixl. 



    range of |x|                        DCOSH(x)

    0.0 to 2^-32                        1.0
    2^-32 to 22.0 = 32*log_e(2)         (e^x + e^-x)/2
    22.0 to 88.03 = 127*log_e(2)        e^|x|/2
    88.03 to 88.722 = 128*log_e(2)      e^(|x| - log_e(2))
    88.722 to infinity                  infinity



Error Conditions 

If the absolute value of the argument is greater than 88.722, the following 
message is issued and the result is set to ± machine infinity using the sign of 
the argument. 

DCOSH: Result overflow 
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GSINH 



Description 

The GSINH routine calculates the double-precision, G-floating-point hyper- 
bolic sine of its double-precision, G-floating-point argument. That is: 

GSINH(x) = sinh(x) 

Routines Called 

GSINH calls the GEXP and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, G-floating-point value in the range 
-709.782713 to 709.782713. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 



Accuracy of Result 

test interval:            0.00000 through 88.721
MRE:                      6.40x10^-18 (57.1 bits)
RMS:                      9.44x10^-19 (59.9 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    3%   87%   10%    0%



Algorithm Used 

GSINH(x) is calculated as follows. 

The table below gives the value of GSINH(x) depending upon the range of 
values for Ixl. 



    range of |x|                        GSINH(x)

    0.0 to 2^-30                        x
    2^-30 to 1.0                        x + x*R(x^2)
    1.0 to 22.0 = 32*log_e(2)           (e^|x| - e^-|x|)/2*sgn(x)
    22.0 to 709.089565                  e^|x|/2*sgn(x)
    709.089565 to 709.782713            e^(|x| - log_e(2))*sgn(x)
    709.782713 to infinity              infinity*sgn(x)
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If Z = X 2 

    R(z) = (rp0 + z*(rp1 + z*(rp2 + z*rp3)))/(q0 + z*(q1 + z*(q2 + z)))
    rp0 = .35181283430177117881x10^6
    rp1 = .11563521196851768270x10^5
    rp2 = .16375798202630751372x10^3
    rp3 = .78966127417357099479
    q0 = -.21108770058106271242x10^7
    q1 = .36162723109421836460x10^5
    q2 = -.27773523119650701667x10^3

Error Conditions 

If the absolute value of the argument is greater than 709.782713, the following
message is issued and the result is set to ± machine infinity, using the sign of
the argument.

GSINH: Result overflow 
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GCOSH 



Description 

The GCOSH routine calculates the double-precision, G-floating-point hyper- 
bolic cosine of its double-precision, G-floating-point argument. That is: 

GCOSH(x) = cosh(x) 

Routines Called 

GCOSH calls the GEXP and MTHERR routines. 

Type of Argument 

The argument must be a double-precision, G-floating-point value in the range 
-709.782713 to 709.782713. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to 1.0. 



Accuracy of Result 

test interval:            0.00000 through 88.721
MRE:                      4.84x10^-18 (57.5 bits)
RMS:                      1.00x10^-18 (59.8 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    3%   84%   13%    0%



Algorithm Used 

GCOSH(x) is calculated as follows. 

The table below gives the value of GCOSH(x) depending upon the range 
of values for Ixl. 



    range of |x|                        GCOSH(x)

    0.0 to 2^-30                        1.0
    2^-30 to 22.0 = 32*log_e(2)         (e^x + e^-x)/2
    22.0 to 709.089565                  e^|x|/2
    709.089565 to 709.782713            e^(|x| - log_e(2))
    709.782713 to infinity              infinity



Error Conditions 

If the absolute value of the argument is greater than 709.782713, the following 
message is issued and the result is set to ± machine infinity, using the sign of 
the argument. 

GCOSH: Result overflow 
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TANH 



Description 

The TANH routine calculates the single-precision, floating-point hyperbolic 
tangent of its single-precision, floating-point argument. That is: 

TANH(x) = tanh(x) 

Routines Called 

TANH calls the EXP routine. 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a single-precision, floating-point value in the range -1.0 
to 1.0. 



Accuracy of Result 






test interval:            0.00000 through 90.000
MRE:                      2.69x10^-8 (25.1 bits)
RMS:                      5.53x10^-9 (27.4 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    0%   79%   21%    0%



Algorithm Used 

TANH(x) is calculated as follows. 

The table below gives the value of TANH(x) depending upon the range of 
values for Ixl. 



    range of |x|                        TANH(x)

    0.0 to 2^-15                        x
    2^-15 to log_e(3)/2                 x + x*R(x^2)
    log_e(3)/2 to 9.8479016             (1 - 2/(e^(2*|x|) + 1))*sgn(x)
    9.8479016 to infinity               1.0*sgn(x)

If g = x^2

    R(g) = g*(a + b*g)/(c + g)
    a = -.823772813
    b = -.383101067x10^-2
    c = 2.47131965

Error Conditions

None
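
The TANH range table and the rational function R(g) can be sketched in Python as follows. This is a hedged illustration rather than the library's implementation: math.exp stands in for the EXP routine, and Python floats are used in place of the single-precision format, so only about eight significant digits should be expected from the rational branch.

```python
import math

A, B, C = -0.823772813, -0.383101067e-2, 2.47131965


def r(g):
    """Rational function R(g) = g*(a + b*g)/(c + g) from the text."""
    return g * (A + B * g) / (C + g)


def tanh_sketch(x):
    """Hyperbolic tangent selected by the range of |x|."""
    ax = abs(x)
    sgn = math.copysign(1.0, x)
    if ax <= 2.0 ** -15:
        return x
    if ax <= math.log(3.0) / 2:
        return x + x * r(x * x)
    if ax <= 9.8479016:
        return (1.0 - 2.0 / (math.exp(2.0 * ax) + 1.0)) * sgn
    # Beyond this point tanh(x) rounds to +/-1 at single precision.
    return 1.0 * sgn
```

Because the last branch never calls the exponential, no argument can cause an overflow, which is consistent with the routine having no error conditions.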
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DTANH 



Description 

The DTANH routine calculates the double-precision, D-floating-point hyper- 
bolic tangent of its double-precision, D-floating-point argument. That is: 

DTANH(x) = tanh(x) 

Routines Called 

DTANH calls the DEXP routine.

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-1.0 to 1.0. 

Accuracy of Result 



test interval:            0.00000 through 90.000
MRE:                      7.17x10^-19 (60.3 bits)
RMS:                      1.75x10^-19 (62.3 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    0%   70%   30%    0%



Algorithm Used 

DTANH(x) is calculated as follows. 

The table below gives the value of DTANH(x) depending upon the range 
of values for Ixl. 



    range of |x|                        DTANH(x)

    0.0 to 2^-32*sqrt(3)                x
    2^-32*sqrt(3) to log_e(3)/2         x + x*R(x^2)
    log_e(3)/2 to 22.1807100            (1 - 2/(e^(2*|x|) + 1))*sgn(x)
    22.1807100 to infinity              1.0*sgn(x)



If g = x^2

    R(g) = g*(rp0 + g*(rp1 + rp2*g))/(q0 + g*(q1 + g*(q2 + g)))
    rp0 = -.161341190239962281x10^4
    rp1 = -.992259296722360833x10^2
    rp2 = -.964374927772254698
    q0 = .484023570719886887x10^4
    q1 = .22337720718962312926x10^4
    q2 = .112744743805349493x10^3

Error Conditions 

None 
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GTANH 



Description 

The GTANH routine calculates the double-precision, G-floating-point hyper- 
bolic tangent of its double-precision, G-floating-point argument. That is: 

GTANH(x) = tanh(x) 

Routines Called 

GTANH calls the GEXP routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-1.0 to 1.0. 



Accuracy of Result 

test interval:            0.00000 through 90.000
MRE:                      6.44x10^-18 (57.1 bits)
RMS:                      1.33x10^-18 (59.4 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    0%   80%   20%    0%



Algorithm Used 

GTANH(x) is calculated as follows. 

The table below gives the value of GTANH(x) depending upon the range 
of values for Ixl. 



    range of |x|                        GTANH(x)

    0.0 to 2^-32*sqrt(3)                x
    2^-32*sqrt(3) to log_e(3)/2         x + x*R(x^2)
    log_e(3)/2 to 22.1807100            (1 - 2/(e^(2*|x|) + 1))*sgn(x)
    22.1807100 to infinity              1.0*sgn(x)

If g = x^2

    R(g) = g*(rp0 + g*(rp1 + rp2*g))/(q0 + g*(q1 + g*(q2 + g)))
    rp0 = -.161341190239962281x10^4
    rp1 = -.992259296722360833x10^2
    rp2 = -.964374927772254698
    q0 = .484023570719886887x10^4
    q1 = .22337720718962312926x10^4
    q2 = .112744743805349493x10^3



Error Conditions 

None 
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Chapter 8 

Random Number Generating Routines 



RAN 

Description 

The RAN routine returns pseudorandom numbers between 0.0 and 1.0, but
not including 0.0 or 1.0. The period of the sequence is 2147483647; that is, the
numbers repeat every 2147483647 calls.

RAN uses a pure multiplicative congruential random number generator with 
prime modulus. The seed value can be supplied by the system or supplied by 
a call to the SETRAN subroutine. (See SETRAN, p. 8-6). 

Routines Called 

RAN does not call any routines; but you can call the SETRAN subroutine to 
provide a seed value and the SAVRAN subroutine (see SAVRAN, p. 8-7) to 
determine the last seed used by RAN. 

Type of Argument 

The argument is a dummy value that is not used. 

Type of Result 

The result returned is a single-precision, floating-point value that is greater
than 0.0 and less than 1.0.

Accuracy of Result 

The independence of successive random numbers generated by multiplicative
congruential methods can be measured by the spectral test. For this generator,
with multiplier 630360016 and modulus 2147483647, the spectral test yields the
following results.

    n     mu(n)     bits

    2     2.446      15
    3     .4766       9
    4     3.715       8
    5     4.944       6
    6     .8183       5

mu(n)   measures how densely n-tuples of random numbers cover an
        n-dimensional square.

bits    is the number of independent bits in successive n-tuples of
        numbers returned by RAN.

For example, successive pairs of random numbers can be considered to be 
independent in their first 15 bits. The remaining 12 bits are not independent. 
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Algorithm Used 

RAN(n) is calculated as follows. 

Using a seed value supplied from a call to the SETRAN subroutine or the
default seed value 524287 (= 2^19 - 1), the random number is calculated by:

    RAN(n) = seed/2^31, truncated

On subsequent calls to RAN, a new seed is calculated from the previous
seed value by:

    seed = seed*630360016 mod (2^31 - 1)

and the random number is then generated.
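
The recurrence can be sketched in Python as follows. The class name and method are hypothetical, chosen only for illustration, and the sketch omits the truncation to the single-precision format, so the low-order digits will differ from what RAN returns on the actual hardware.

```python
MODULUS = 2**31 - 1        # 2147483647, a prime
MULTIPLIER = 630360016
DEFAULT_SEED = 2**19 - 1   # 524287


class RanSketch:
    """Pure multiplicative congruential generator as described above."""

    def __init__(self, seed=DEFAULT_SEED):
        self.seed = seed
        self.first_call = True

    def ran(self):
        # The first call uses the supplied seed directly; every later
        # call advances the seed before generating a number.
        if self.first_call:
            self.first_call = False
        else:
            self.seed = (self.seed * MULTIPLIER) % MODULUS
        return self.seed / 2**31


generator = RanSketch()
sample = [generator.ran() for _ in range(3)]
```

Because the modulus is prime and the multiplier is not a multiple of it, a nonzero seed can never become zero, so every value lies strictly between 0.0 and 1.0.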

References 

A full description of the spectral test is given in R.R. Coveyou and R.D.
MacPherson, Journal of the ACM 14 (1967), pp. 100-119 and in D.E. Knuth,
Seminumerical Algorithms (Reading, Mass.: Addison-Wesley, 1981), Section
3.3.4.

Error Conditions 

None 
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RANS 



Description 

The RANS routine returns pseudorandom numbers between 0.0 and 1.0, but
not including 0.0 or 1.0. The period of the sequence is 2484877906816; that is,
the numbers repeat every 2484877906816 calls.

RANS is based on the same multiplicative random number generator as RAN 
(p. 8-3). In addition, it shuffles the numbers using a 128-word table. 

Routines Called 

RANS calls the RAN and SAVRAN routines. 

Type of Argument 

The argument is a dummy value that is not used. 

Type of Result 

The result returned is a single-precision, floating-point value that is greater
than 0.0 and less than 1.0.

Accuracy of Result 

Not applicable 

Algorithm Used 

RANS(n) is calculated as follows. 

On the initial reference to RANS, RAN is called 128 times to generate S(1),
S(2),...,S(128) (uniform random deviates in (0,1)) and a new seed x(0). x(0) is
obtained from a call to the SAVRAN subroutine (see SAVRAN, p. 8-7)
after S(128) has been generated. Then:

    x(i+1) = 630360016*x(i) mod (2^31 - 1)
    j = (x(i+1) mod 128) + 1
    S(j) = x(i+1)/2^31
    t <- S(j)
    RANS(n) = t

Error Conditions 

None 
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SETRAN 



Description 

The SETRAN subroutine provides the internal integer seed value for the RAN 
routine. 

SETRAN is used to reset RAN to return the same sequence of random num- 
bers again, or to set RAN to an arbitrary value (such as the time of day) so 
that it will return an entirely new sequence. 

Routines Called 

SETRAN does not call any routines; but you can call the SAVRAN subrou- 
tine to save and return the last seed value used by RAN. 

Type of Argument 

The argument must be an integer value in the range 0 to 2^31. If the argument
is 0, the default seed value for RAN is used.

Type of Result 

Not applicable 

Accuracy of Result 

Not applicable 

Algorithm Used 

SETRAN(n) is calculated as follows. 

Using the value supplied, SETRAN computes:

    seed = |seed| mod 2147483647
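
As a small illustration, the seed computation might look like this in Python. The function name is hypothetical, and the default seed 524287 is taken from the RAN description; how the library treats an argument whose magnitude is an exact multiple of the modulus is not stated, so the sketch does not address that case.

```python
MODULUS = 2147483647      # 2^31 - 1
DEFAULT_SEED = 524287     # RAN's default seed, 2^19 - 1


def setran_seed(n):
    """Return the internal seed SETRAN would derive from argument n."""
    if n == 0:
        return DEFAULT_SEED   # an argument of 0 selects RAN's default seed
    return abs(n) % MODULUS
```

For example, an argument of 2^31 reduces to a seed of 1, since 2^31 = (2^31 - 1) + 1.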

Error Conditions 

None 






SAVRAN 



Description 

The SAVRAN subroutine saves and returns the last seed used by the RAN 
routine. 

Routines Called 

None 

Type of Argument 

The argument must be an integer variable in which the seed value will be 
stored. 

Type of Result 

The result returned is an integer value between 1 and 2147483647. 

Accuracy of Result 

Not applicable 

Algorithm Used 

Not applicable 

Error Conditions 

None 
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Chapter 9 

Absolute Value Routines 



IABS 



Description 

The IABS routine returns the integer absolute value of its integer argument. 
That is: 

IABS(n) = |n|

Routines Called 

None 

Type of Argument 

The argument must be an integer value; it can be any such value. 

Type of Result 

The result returned is an integer value greater than or equal to 0. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IABS(n) is calculated as follows. 

If n >= 0

    IABS(n) = n

If n < 0

    IABS(n) = -n

Error Conditions 

If the argument is the "most negative integer" (400000000000 octal), overflow
occurs and the result is set to machine infinity.
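
The overflow case arises because a 36-bit two's-complement word has one more negative value than positive values. The Python sketch below models this; the function name is hypothetical, and raising OverflowError is an assumed stand-in for the library setting the result to machine infinity.

```python
WORD_BITS = 36
MOST_NEGATIVE = -(2 ** (WORD_BITS - 1))    # bit pattern 400000000000 octal
MOST_POSITIVE = 2 ** (WORD_BITS - 1) - 1   # bit pattern 377777777777 octal


def iabs_sketch(n):
    """Integer absolute value with a 36-bit range check."""
    if not MOST_NEGATIVE <= n <= MOST_POSITIVE:
        raise ValueError("argument does not fit in a 36-bit word")
    result = n if n >= 0 else -n
    if result > MOST_POSITIVE:
        # Only the most negative integer reaches this point.
        raise OverflowError("|n| is not representable in 36 bits")
    return result
```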
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ABS 



Description 

The ABS routine returns the single-precision, floating-point absolute value of 
its single-precision, floating-point argument. That is: 

ABS(x) = |x|

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a single-precision, floating-point value greater than or 
equal to 0.0. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

ABS(x) is calculated as follows. 



If x >= 0.0

    ABS(x) = x

If x < 0.0

    ABS(x) = -x

Error Conditions

None
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DABS 



Description 

The DABS routine returns the double-precision, D-floating-point absolute
value of its double-precision, D-floating-point argument. That is:

DABS(x) = |x|

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
or equal to 0.0. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DABS(x) is calculated as follows. 

If x >= 0.0

    DABS(x) = x

If x < 0.0

    DABS(x) = -x

Error Conditions 

None 
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GABS 



Description 

The GABS routine returns the double-precision, G-floating-point absolute 
value of its double-precision, G-floating-point argument. That is: 

GABS(x) = |x|

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to 0.0. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GABS(x) is calculated as follows. 

If x >= 0.0

    GABS(x) = x

If x < 0.0

    GABS(x) = -x

Error Conditions 

None 
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CABS 



Description 

The CABS routine returns the single-precision, floating-point absolute value
of its complex, single-precision, floating-point argument. That is:

CABS(z) = |z|

Routines Called 

CABS calls the SQRT and MTHERR routines. 

Type of Argument 

The argument must be a complex, single-precision, floating-point value; it 
can be any such value. 

Type of Result 

The result returned is a single-precision, floating-point value greater than or 
equal to 0.0. 



Accuracy of Result 

test interval:            -1.00000x10^18 through 1.00000x10^18 real
                          -1.00000x10^18 through 1.00000x10^18 imaginary
MRE:                      1.84x10^-8 (25.7 bits)
RMS:                      5.36x10^-9 (27.5 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%   14%   65%   21%    0%



Algorithm Used 

CABS(z) is calculated as follows. 

Let z = x + i*y
    v = MAX(|x|,|y|)
    w = MIN(|x|,|y|)

Then CABS(z) = v*sqrt(1.0 + (w/v)^2)
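
Scaling by the larger component keeps the intermediate values near 1.0, so the squares cannot overflow even when the true magnitude is representable. A minimal Python sketch follows; math.sqrt stands in for the SQRT routine, and returning 0.0 when both components are zero is an assumption, since the manual does not describe that case.

```python
import math


def cabs_sketch(z):
    """Complex absolute value using the scaled formula above."""
    v = max(abs(z.real), abs(z.imag))
    w = min(abs(z.real), abs(z.imag))
    if v == 0.0:
        return 0.0  # assumed handling of z = 0 to avoid dividing by zero
    return v * math.sqrt(1.0 + (w / v) ** 2)
```

For comparison, forming x*x + y*y directly with x = y = 1e200 overflows in IEEE double precision, while the scaled form returns a finite result.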

Error Conditions 

If the argument is so large that it causes an overflow, the following message is 
issued and the result is set to + machine infinity. 

CABS: Result overflow 
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CDABS 



Description 

The CDABS routine calculates the double-precision, D-floating-point abso- 
lute value of its complex, double-precision, D-floating-point argument. That 
is: 

CDABS(z) = |z|

z = location of input value 

Routines Called 

CDABS calls the DSQRT and MTHERR routines. 

Type of Argument 

The argument must be a two-element, double-precision vector that contains
the input value (z). z must be a complex, double-precision, D-floating-point
value; it can be any such value.

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
or equal to 0.0. 



Accuracy of Result 

test interval:            -1.00000x10^18 through 1.00000x10^18 real
                          -1.00000x10^18 through 1.00000x10^18 imaginary
MRE:                      6.32x10^-19 (60.5 bits)
RMS:                      1.89x10^-19 (62.2 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    4%   56%   38%    2%



Algorithm Used 

CDABS(z) is calculated as follows. 

Let z = x + i*y
    v = MAX(|x|,|y|)
    w = MIN(|x|,|y|)

Then CDABS(z) = v*sqrt(1.0 + (w/v)^2)

Error Conditions 

If the argument is so large that overflow occurs, the following message is
issued and the result is set to + machine infinity.

CDABS: Result overflow 
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CGABS 



Description 

The CGABS routine calculates the double-precision, G-floating-point absolute
value of its complex, double-precision, G-floating-point argument. That is:

CGABS(z) = |z|

z = location of input value 

Routines Called 

CGABS calls the GSQRT and MTHERR routines. 

Type of Argument 

The argument must be a two-element, double-precision vector that contains 
the input value (z). Z must be a complex, double-precision, G-floating-point 
value; it can be any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to 0.0. 



Accuracy of Result 

test interval:            -1.00000x10^18 through 1.00000x10^18 real
                          -1.00000x10^18 through 1.00000x10^18 imaginary
MRE:                      4.88x10^-18 (57.5 bits)
RMS:                      1.51x10^-18 (59.2 bits)
LSB error distribution:   -2    -1     0    +1    +2
                          0%    4%   56%   38%    2%



Algorithm Used 

CGABS(z) is calculated as follows. 

Let z = x + i*y
    v = MAX(|x|,|y|)
    w = MIN(|x|,|y|)

Then CGABS(z) = v*sqrt(1.0 + (w/v)^2)

Error Conditions 

If the argument is so large that overflow occurs, the following message is 
issued and the result is set to + machine infinity. 

CGABS: Result overflow 






Chapter 10 

Data Type Conversion Routines 



IFIX 



Description 

The IFIX routine converts and truncates its single-precision, floating-point 
argument to an integer value. 

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value less than 2^35.

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IFIX(x) is calculated by means of the FIX machine instruction. This instruc- 
tion converts and truncates the argument to an integer. 

Error Conditions 

If the argument is greater than 2^36, an overflow occurs and the result is set to
machine infinity.
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INT 



Description 

The INT routine converts and truncates its single-precision, floating-point 
argument to an integer value. 

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value less than 2^35.

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

INT(x) is calculated by means of the FIX machine instruction. This instruc- 
tion converts and truncates the argument to an integer. 

Error Conditions 

If the argument is greater than 2^36, an overflow occurs and the result is set to
machine infinity.
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IDINT 



Description 

The IDINT routine converts and truncates its double-precision, D-floating- 
point argument to an integer value. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IDINT(x) is calculated as follows. 

The routine, working on the magnitude of the argument, copies the expo- 
nent field to a scratch register. It then clears the exponent field of the 
magnitude of the argument, and uses the copy of the exponent to control a 
shift to leave the integer in the location of the result. If necessary, the 
routine negates the result. 
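
The exponent-controlled shift can be sketched in Python as shown below. This is an illustration under stated assumptions: an IEEE double (53-bit fraction) stands in for the D-floating-point format, math.frexp plays the role of separating the exponent field from the fraction, and raising OverflowError is an assumed stand-in for setting the result to machine infinity.

```python
import math

FRACTION_BITS = 53       # IEEE double stand-in for the D-floating fraction
MAX_MAGNITUDE = 2**35    # largest magnitude that fits a 36-bit signed word


def idint_sketch(x):
    """Truncate x to an integer by shifting its fraction bits."""
    # Work on the magnitude: split into fraction in [0.5, 1) and exponent.
    fraction, exponent = math.frexp(abs(x))
    bits = int(fraction * 2**FRACTION_BITS)   # fraction as an exact integer
    shift = exponent - FRACTION_BITS
    # The exponent controls the shift; bits shifted out on the right are
    # the discarded fractional part.
    magnitude = bits << shift if shift >= 0 else bits >> -shift
    limit = MAX_MAGNITUDE if x < 0 else MAX_MAGNITUDE - 1
    if magnitude > limit:
        raise OverflowError("significant bits lost on the left")
    return -magnitude if x < 0 else magnitude
```

Negating only at the end is what makes the conversion truncate toward zero for negative arguments.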

Error Conditions 

If the shift results in a loss of significant bits on the left, an overflow occurs 
and the result is set to machine infinity. 
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GFX.n 



Description 

The GFX.n routine converts and truncates its double-precision, G-floating-point 
argument to an integer value. n is an even octal number from 0 through 14 that 
designates a register (AC). 

Routines Called 

None 

Calling Sequence 

GFX.n is not called like most of the other routines in the library (see Section 
1.4.1). It is called by: 

EXTEND n, GFX.n 

Type of Argument 

The argument must be a double-precision, G-floating-point value less than 
2^35. It must be stored in the AC specified in the routine name. 

Type of Result 

The result returned is an integer value; it may be any such value. It is re- 
turned in the AC specified in the routine name. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GFX.n(x) is calculated by means of the GFIX machine instruction. This 
instruction converts and truncates the argument to an integer. 

Error Conditions 

If the argument is greater than 2^36, an overflow occurs and the result is set to 
machine infinity. 
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REAL 



Description 

The REAL routine converts and rounds its integer argument into a single- 
precision, floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be an integer value; it can be any such value. 

Type of Result 

The result returned is a single-precision, floating-point value less than 2^35. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

REAL(n) is calculated by means of the FLTR machine instruction. This 
instruction converts and rounds the argument to a single-precision, floating- 
point value. 

Error Conditions 

None 
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FLOAT 



Description 

The FLOAT routine converts and rounds its integer argument to a single- 
precision, floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be an integer value; it can be any such value. 

Type of Result 

The result returned is a single-precision, floating-point value less than 2^35. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

FLOAT(n) is calculated by means of the FLTR machine instruction. This 
instruction converts and rounds the argument to a single-precision floating- 
point value. 

Error Conditions 

None 
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SNGL 



Description 

The SNGL routine converts and rounds its double-precision, D-floating-point 
argument to a single-precision, floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. 

Accuracy of Result 

The result is accurate to half a least significant bit because of rounding. 

Algorithm Used 

SNGL(x) is calculated as follows. 

The routine tests the most significant bit of the low word of the magnitude of 
the argument. 

If it is 0, the high word is returned. 

If it is 1, the low bit of the high word of the magnitude is tested. 

If it is 0, it is made 1 and negated if necessary. 

If it is 1, the high word of the magnitude is incremented and negated if 
necessary. 

Error Conditions 

If overflow occurs, the result is set to machine infinity. 
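
The bit tests above can be sketched in Python as follows. The sketch treats the two words of the magnitude as non-negative integers; the 35-bit low word, the sign handling, and the name sngl_round are assumptions made for illustration only.

```python
LOW_WORD_BITS = 35  # assumption: significant bits carried in the low word


def sngl_round(high: int, low: int, negative: bool = False) -> int:
    """Round a two-word magnitude to one word, following the steps above."""
    msb_of_low = (low >> (LOW_WORD_BITS - 1)) & 1
    if msb_of_low == 0:
        result = high        # discarded part is below one half: keep high word
    elif high & 1 == 0:
        result = high | 1    # low bit of the high word is 0: make it 1
    else:
        result = high + 1    # low bit is 1: increment the high word
    return -result if negative else result
```

Incrementing the whole word lets a carry out of the fraction propagate into the exponent field, which is how overflow can arise.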
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GSN.n 



Description 

The GSN.n routine converts and rounds its double-precision, G-floating-point 
argument to a single-precision, floating-point value. n is an even octal number 
from 0 through 14 that designates a register (AC). 

Routines Called 

None 

Calling Sequence 

GSN.n is not called like most of the other routines in the library (see Section 
1.4.1). It is called by: 

EXTEND n, GSN.n 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. It must be stored in the AC specified in the routine name. 

Type of Result 

The result returned is a single-precision, floating-point value; it may be any 
such value. It is returned in the AC specified in the routine name. 

Accuracy of Result 

The result is accurate to half a least significant bit because of rounding. 

Algorithm Used 

GSN.n(x) is calculated as follows. 

The routine tests the most significant bit of the low word of the magnitude of 
the argument. 

If it is 0, the high word is returned. 

If it is 1, the low bit of the high word of the magnitude is tested. 

If it is 0, it is made 1 and negated if necessary. 

If it is 1, the high word of the magnitude is incremented and negated if 
necessary. 

Error Conditions 

1. If overflow occurs, the result is set to machine infinity. 

2. If underflow occurs, the result is set to 0.0. 
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DFLOAT 



Description 

The DFLOAT routine converts its integer argument to a double-precision, 
D-floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be an integer value; it can be any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value less than 2^35. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DFLOAT(n) is calculated by moving the value of the argument to the loca- 
tions used by a double-precision result. See Chapter 1 for a discussion of the 
location of the result. 

Error Conditions 

None 
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DBLE 



Description 

The DBLE routine converts its single-precision floating-point argument to a 
double-precision, D-floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DBLE(x) is calculated by moving the value of the argument to the locations 
used by a double-precision result. (See Chapter 1 for a discussion of the 
location of the result.) The low order word is set to 0. 

Error Conditions 

None 
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GTOD 



Description 

The GTOD routine converts its double-precision, G-floating-point argument 
to a double-precision, D-floating-point value. 

Routines Called 

GTOD calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GTOD(x) is calculated by converting the double-precision, G-floating-point 
value to double-precision, D-floating-point and setting the low-order three bits 
to 0. 

Error Conditions 

1. If the resulting exponent is too small to be represented as a double- 
precision, D-floating-point number, the following message is issued and 
the result is set to 0.0. 

GTOD: Result underflow 

2. If the resulting exponent is too large to be represented as a double- 
precision, D-floating-point number, the following message is issued and 
the result is set to +machine infinity. 

GTOD: Result overflow 
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GTODA 



Description 

The GTODA subroutine converts an array of double-precision, G-floating- 
point values to an array of double-precision, D-floating-point values. It is 
called as: 

GTODA (x,y,i) 
x = input array 
y = array used for result 
i = number of elements to convert 

Routines Called 

GTODA calls the MTHERR routine. 

Type of Arguments 

GTODA is a subroutine that is called with three arguments. The first and 
second arguments must be double-precision arrays. The third argument must 
be an integer value representing the number of elements to be converted. The 
first array (x) contains the input values; the second array (y) will contain the 
results. The input values must be double-precision, G-floating-point values; 
they can be any such values. 

Type of Result 

The result returned is an array of double-precision, D-floating-point values; 
they may be any such values. They are returned in the second array (y) 
supplied in the call. 

Accuracy of Result 

The result is exact for each value converted. 

Algorithm Used 

GTODA(x,y,i) is calculated as follows. 

Using the number specified in the third argument, GTODA converts each 
double-precision, G-floating-point value to a double-precision, D-floating- 
point value and sets the low-order three bits to 0. Each converted value is 
stored in the second array. 

Error Conditions 

1. For each resulting exponent that is too small to be represented as a 
double-precision, D-floating-point number, the following message is is- 
sued and the result is set to 0.0. 

GTODA: Result underflow 

2. For each resulting exponent that is too large to be represented as a double- 
precision, D-floating-point number, the following message is issued and 
the result is set to +machine infinity. 

GTODA: Result overflow 
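
The per-element loop and range checks can be sketched in Python. The sketch does not model the bit-level conversion or the clearing of the low-order three bits; the exponent limits, the stand-in for +machine infinity, the report parameter, and the name gtoda_sketch are all assumptions.

```python
import math

# Assumption: D-floating magnitudes are fraction * 2**e with
# 0.5 <= fraction < 1 and -128 <= e <= 127.
D_EXP_MIN, D_EXP_MAX = -128, 127
# Stand-in for +machine infinity: largest magnitude at the top exponent.
D_INFINITY = math.ldexp(1.0 - 2.0**-53, D_EXP_MAX)


def gtoda_sketch(x, y, count, report=print):
    """Convert the first `count` elements of x into y, checking the range."""
    for k in range(count):
        value = x[k]
        _, exponent = math.frexp(value)
        if value != 0.0 and exponent < D_EXP_MIN:
            report("GTODA: Result underflow")
            y[k] = 0.0
        elif exponent > D_EXP_MAX:
            report("GTODA: Result overflow")
            y[k] = D_INFINITY
        else:
            y[k] = value
```

A message is issued for every element that is out of range, so one call can report several errors.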
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GFL.n 



Description 

The GFL.n routine converts its integer argument to a double-precision, 
G-floating-point value. n is an even octal number from 0 through 14 that 
designates a register (AC). 

Routines Called 

None 

Calling Sequence 

GFL.n is not called like most of the routines in the library (see Section 1.4.1). 
It is called by: 

EXTEND n, GFL.n 

Type of Argument 

The argument must be an integer value; it can be any such value. It must be 
stored in the AC specified in the routine name. 

Type of Result 

The result returned is a double-precision, G-floating-point value less than 2^35. 
It is returned in the AC specified in the routine name. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GFL.n(n) is calculated by moving the value of the argument to the locations 
used by a double-precision result (see Chapter 1). 

Error Conditions 

None 
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GDB.n 



Description 

The GDB.n routine converts its single-precision, floating-point argument to a 
double-precision, G-floating-point value. n is an even octal number from 0 
through 14 that designates a register (AC). 

Routines Called 

None 

Calling Sequence 

GDB.n is not called like most of the routines in the library (see Section 1.4.1). 
It is called by: 

EXTEND n, GDB.n 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. It must be stored in the AC specified in the routine name. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. It is returned in the AC specified in the routine name. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GDB.n(x) is calculated as follows. 

The routine uses the GDBLE machine instruction to convert the argument 
and move it to the locations used for double-precision results. 

Error Conditions 

None 
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DTOG 



Description 

The DTOG routine converts its double-precision, D-floating-point argument 
to a double-precision, G-floating-point value. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

DTOG(x) is calculated by converting the double-precision, D-floating-point 
value to a double-precision, G-floating-point value and rounding the con- 
verted value. 

Error Conditions 

None 
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DTOGA 



Description 

The DTOGA subroutine converts an array of double-precision, D-floating- 
point values to an array of double-precision, G-floating-point values. It is 
called as: 

DTOGA(x,y,i) 
x = input array 
y = array used for result 
i = number of elements to convert 

Routines Called 

None 

Type of Arguments 

DTOGA is a subroutine that is called with three arguments. The first and 
second arguments must be double-precision arrays. The third argument must 
be an integer value representing the number of elements to be converted. The 
first array (x) contains the input values; the second array (y) will contain the 
result. The input values must be double-precision, D-floating-point values; 
they can be any such values. 

Type of Result 

The result returned is an array of double-precision, G-floating-point values; 
they may be any such values. They are returned in the second array (y) 
supplied in the call. 

Accuracy of Result 

Each element of the result is rounded with an error bound of half a least 
significant bit. 

Algorithm Used 

DTOGA(x,y,i) is calculated as follows. 

Using the number specified in the third argument, DTOGA converts each 
double-precision, D-floating-point value to a double-precision, G-floating- 
point value and rounds the converted value. Each converted value is 
stored in the second array. 

Error Conditions 

None 




CMPL.I 



Description 

The CMPL.I routine converts its two integer arguments into a complex, 
single-precision, floating-point value. 

Routines Called 

None 

Type of Arguments 

Both arguments must be integer values; they can be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit for 
each part (real and imaginary). 

Algorithm Used 

CMPL.I(n,m) is calculated as follows. 

The two arguments are converted to single-precision, floating-point values 
using the FLTR machine instructions. These values are then moved to the 
locations where the result is stored as a complex value (see Chapter 1). 
The first argument is used as the real part of the complex number and the 
second argument as the imaginary part. 

Error Conditions 

None 
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CMPLX 



Description 

The CMPLX routine converts two single-precision arguments into one com- 
plex single-precision, floating-point value. 

Routines Called 

None 

Type of Arguments 

Both arguments must be single-precision, floating-point values; they can be 
any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

CMPLX(x,y) is calculated by moving the arguments to the locations used for 
a complex result (see Chapter 1). The first argument is used as the real part of 
the complex number and the second argument as the imaginary part. 

Error Conditions 

None 
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CMPL.D 



Description 

The CMPL.D routine converts its two double-precision, D-floating-point ar- 
guments into a complex, single-precision, floating-point value. 

Routines Called 

None 

Type of Arguments 

The arguments must be double-precision, D-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

The result is accurate to half a least significant bit for each part because of 
rounding. 

Algorithm Used 

CMPL.D(x,y) is calculated by converting the arguments to single-precision 
and then moving them to the locations used for the real and imaginary parts 
of the complex result (see Chapter 1). The first argument is used as the real 
part of the complex number and the second argument as the imaginary part. 

Error Conditions 

If overflow occurs on the conversions, the result is set to machine infinity for 
either or both of the parts of the result. 
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CMPL.G 



Description 

The CMPL.G routine converts its two double-precision, G-floating-point ar- 
guments into a complex, single-precision, floating-point value. 

Routines Called 

None 

Type of Arguments 

The arguments must be double-precision, G-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

The result is accurate to half a least significant bit for each part because of 
rounding. 

Algorithm Used 

CMPL.G(x,y) is calculated by converting the arguments to single-precision 
and then moving them to the locations used for the real and imaginary parts 
of the complex result (see Chapter 1). The first argument is used as the real 
part of the complex number and the second argument as the imaginary part. 

Error Conditions 

1. If overflow occurs on the conversions, the result is set to machine infinity 
for either or both of the parts of the result. 

2. If underflow occurs on the conversions, the result is set to 0.0 for either or 
both parts of the result. 
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CMPL.C 



Description 

The CMPL.C routine creates a complex, single-precision, floating-point value 
from the real parts of two complex, single-precision, floating-point values. 

Routines Called 

None 

Type of Arguments 

The arguments must be complex, single-precision, floating-point values; they 
can be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

CMPL.C(z,g) is calculated by moving the arguments to the locations used for 
a complex result (see Chapter 1). The first argument is used as the real part of 
the complex number and the second argument as the imaginary part. 

Error Conditions 

None 
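
As a small illustration, the construction can be sketched in Python with the built-in complex type standing in for the complex, single-precision format. The name cmpl_c is hypothetical.

```python
def cmpl_c(z: complex, g: complex) -> complex:
    """Build a complex value from the real parts of z and g."""
    # The real part of the first argument becomes the real part of the
    # result; the real part of the second becomes the imaginary part.
    return complex(z.real, g.real)
```

The imaginary parts of both arguments are ignored.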
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Chapter 11 

Rounding and Truncation Routines 



NINT 



Description 

The NINT routine rounds its single-precision, floating-point argument to the 
nearest integer. 

Routines Called 

NINT calls the MTHERR routine. 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

NINT(x) is calculated as follows. 

Let j = INT(|x| + .5) 

If j < 2^35 and 
    If x > 0.0 
        NINT(x) = j 
    If x < 0.0 
        NINT(x) = -j 

If j = 2^35 and 
    If x < 0.0 
        NINT(x) = -j 

Otherwise, overflow occurs and 
    If x > 0.0 
        NINT(x) = 2^35 - 1 
    If x < 0.0 
        NINT(x) = -2^35 

Error Conditions 

If x is greater than or equal to 2^35 or less than -2^35, the result overflows. 
When overflow occurs, the following message is issued and the result is set to 
+machine infinity if x is greater than 0.0 or to -machine infinity if x is less 
than 0.0. 

NINT: Result overflow 
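
The case analysis above can be sketched in Python. This is an illustration only: the name nint_sketch and the report parameter are hypothetical, Python floats stand in for the single-precision format, and an argument of exactly 0.0 is treated as non-negative.

```python
import math

LIMIT = 2**35  # magnitude limit of a 36-bit two's-complement integer


def nint_sketch(x: float, report=print) -> int:
    """Round x to the nearest integer, saturating on overflow."""
    j = math.trunc(abs(x) + 0.5)
    if j < LIMIT:
        return j if x >= 0.0 else -j
    if j == LIMIT and x < 0.0:
        return -j                      # -2^35 is still representable
    report("NINT: Result overflow")
    return LIMIT - 1 if x > 0.0 else -LIMIT
```

Because the rounding is applied to the magnitude, halfway cases move away from zero: 2.5 becomes 3 and -2.5 becomes -3.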
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IDNINT 



Description 

The IDNINT routine rounds its double-precision, D-floating-point argument 
to the nearest integer. 

Routines Called 

IDNINT calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IDNINT(x) is calculated as follows. 

Let j = INT(|x| + .5) 

If j < 2^35 and 
    If x > 0.0 
        IDNINT(x) = j 
    If x < 0.0 
        IDNINT(x) = -j 

If j = 2^35 and 
    If x < 0.0 
        IDNINT(x) = -j 

Otherwise, overflow occurs and 
    If x > 0.0 
        IDNINT(x) = 2^35 - 1 
    If x < 0.0 
        IDNINT(x) = -2^35 

Error Conditions 

If x is greater than or equal to 2^35 or less than -2^35, the result overflows. 
When overflow occurs, the following message is issued and the result is set to 
+machine infinity if x is greater than 0.0 or to -machine infinity if x is less 
than 0.0. 

IDNINT: Result overflow 
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IGNIN. 



Description 

The IGNIN. routine rounds its double-precision, G-floating-point argument 
to the nearest integer. 

Routines Called 

IGNIN. calls the MTHERR routine. 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is an integer value; it may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IGNIN.(x) is calculated as follows. 

Let j = INT(|x| + .5) 

If j < 2^35 and 
    If x > 0.0 
        IGNIN.(x) = j 
    If x < 0.0 
        IGNIN.(x) = -j 

If j = 2^35 and 
    If x < 0.0 
        IGNIN.(x) = -j 

Otherwise, overflow occurs and 
    If x > 0.0 
        IGNIN.(x) = 2^35 - 1 
    If x < 0.0 
        IGNIN.(x) = -2^35 

Error Conditions 

If x is greater than or equal to 2^35 or less than -2^35, the result overflows. 
When overflow occurs, the following message is issued and the result is set to 
+machine infinity if x is greater than 0.0 or to -machine infinity if x is less 
than 0.0. 

IGNIN.: Result overflow 
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ANINT 



Description 

The ANINT routine rounds its single-precision, floating-point argument to 
the nearest single-precision, floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a single-precision, floating-point whole value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

ANINT(x) is calculated as follows. 

If |x| > 2^26 

    ANINT(x) = x because x is an integer 

If |x| < 2^26 
    If x > 0.0 

        ANINT(x) = ((|x| + 2^26) rounded) - 2^26 

    If x < 0.0 

        ANINT(x) = -(((|x| + 2^26) rounded) - 2^26) 

Error Conditions 

None 
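
The add-and-subtract technique can be tried in Python, with the caveat that Python floats are IEEE doubles rather than the format described here. The sketch therefore uses 2^52 in place of 2^26, and halfway cases follow the IEEE round-to-even rule, which may differ from the rounding of the original machine. The name anint_sketch is hypothetical.

```python
# Assumption: IEEE double precision, where values of magnitude 2**52 or
# more have no fraction bits. This plays the role of 2^26 in the text.
MAGIC = 2.0**52


def anint_sketch(x: float) -> float:
    """Round x to a whole number by adding and subtracting a large constant."""
    magnitude = abs(x)
    if magnitude >= MAGIC:
        return x                              # already a whole number
    # The addition pushes the fraction bits out of the significand, so the
    # hardware rounding of the sum performs the rounding to a whole number.
    rounded = (magnitude + MAGIC) - MAGIC
    return rounded if x >= 0.0 else -rounded
```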
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DNINT 



Description 

The DNINT routine rounds its double-precision, D-floating-point argument 
to the nearest double-precision, D-floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point whole value; it 
may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DNINT(x) is calculated as follows. 

If |x| > 2^61 

    DNINT(x) = x because x is an integer 

If |x| < 2^61 
    If x > 0.0 

        DNINT(x) = ((|x| + 2^61) rounded) - 2^61 

    If x < 0.0 

        DNINT(x) = -(((|x| + 2^61) rounded) - 2^61) 

Error Conditions 

None 
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GNINT. 



Description 

The GNINT. routine rounds its double-precision, G-floating-point argument 
to the nearest double-precision, G-floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point whole value; it 
may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GNINT.(x) is calculated as follows. 

If |x| > 2^58 

    GNINT.(x) = x because x is an integer 

If |x| < 2^58 
    If x > 0.0 

        GNINT.(x) = ((|x| + 2^58) rounded) - 2^58 

    If x < 0.0 

        GNINT.(x) = -(((|x| + 2^58) rounded) - 2^58) 

Error Conditions 

None 
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AINT 



Description 

The AINT routine truncates its single-precision, floating-point argument to a 
single-precision, floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a single-precision, floating-point value; it can be any 
such value. 

Type of Result 

The result returned is a single-precision, floating-point whole value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

AINT(x) is calculated as follows. 

If |x| > 2^26 

    AINT(x) = x because x is an integer 

If |x| < 2^26 
    If x > 0.0 

        AINT(x) = ((|x| + 2^26) truncated) - 2^26 

    If x < 0.0 

        AINT(x) = -(((|x| + 2^26) truncated) - 2^26) 

Error Conditions 

None 
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DINT 



Description 

The DINT routine truncates its double-precision, D-floating-point argument 
to a double-precision, D-floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, D-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, D-floating-point whole value; it 
may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DINT(x) is calculated as follows. 

If |x| > 2^61 

    DINT(x) = x because x is an integer 

If |x| < 1.0 

    DINT(x) = 0.0 

Otherwise 

    DINT(x) = sgn(x)*(|x| with fraction bits replaced by zeroes) 

Error Conditions 

None 
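
Replacing the fraction bits by zeroes can be sketched in Python. The sketch works on IEEE doubles, so the threshold is 2^53 rather than 2^61; the name dint_sketch and the use of math.frexp to locate the fraction bits are assumptions made for illustration.

```python
import math

MANTISSA_BITS = 53  # assumption: IEEE double; the text's format uses 2^61


def dint_sketch(x: float) -> float:
    """Truncate x to a whole number by zeroing its fraction bits."""
    magnitude = abs(x)
    if magnitude < 1.0:
        return 0.0
    fraction, exponent = math.frexp(magnitude)
    if exponent >= MANTISSA_BITS:
        return x                       # no fraction bits: already whole
    # View the significand as an integer bit pattern.
    bits = int(math.ldexp(fraction, MANTISSA_BITS))
    fraction_bit_count = MANTISSA_BITS - exponent
    bits &= ~((1 << fraction_bit_count) - 1)   # zero the fraction bits
    whole = math.ldexp(float(bits), -fraction_bit_count)
    return math.copysign(whole, x)             # reapply sgn(x)
```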
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GINT. 



Description 

The GINT. routine truncates its double-precision, G-floating-point argument 
to a double-precision, G-floating-point whole number. 

Routines Called 

None 

Type of Argument 

The argument must be a double-precision, G-floating-point value; it can be 
any such value. 

Type of Result 

The result returned is a double-precision, G-floating-point whole value; it 
may be any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GINT.(x) is calculated as follows. 

If |x| > 2^58 

    GINT.(x) = x because x is an integer 

If |x| < 1.0 

    GINT.(x) = 0.0 

Otherwise 

    GINT.(x) = sgn(x)*(|x| with fraction bits replaced by zeroes) 

Error Conditions 

None 
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Chapter 12 

Product, Remainder, and Positive Difference 
Routines 



DPROD 



Description 

The DPROD routine multiplies two single-precision, floating-point numbers 
and returns a double-precision, D-floating-point product. That is: 

DPROD(x,y) = x*y 

Routines Called 

DPROD calls the MTHERR routine. 

Type of Arguments 

Both arguments must be single-precision, floating-point values; they can be 
any such values. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DPROD(x,y) is calculated as follows. 

Let x = DBLE(x) 
y = DBLE(y) 

DPROD(x,y) = x*y 

Error Conditions 

1. If overflow occurs, the following message is issued and the result is set to 
±machine infinity. 

DPROD: Result overflow 

2. If underflow occurs, the following message is issued and the result is set to 
0.0. 

DPROD: Result underflow 
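
Why the product is exact can be checked with a short Python sketch. IEEE single and double precision stand in for the formats described here, which is an assumption; the helper names to_single and dprod_sketch are hypothetical. The argument is the same in both cases: the product of two short fractions needs at most twice as many bits, and the double-precision fraction is wide enough to hold them all.

```python
import struct
from fractions import Fraction


def to_single(x: float) -> float:
    """Round x to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dprod_sketch(x: float, y: float) -> float:
    """Multiply two single-precision values in double precision."""
    # Python floats are doubles, so widening is implicit (the DBLE step).
    return x * y


a, b = to_single(1.1), to_single(3.3)
# Exact rational arithmetic confirms that no rounding took place.
product_is_exact = Fraction(dprod_sketch(a, b)) == Fraction(a) * Fraction(b)
```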
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GPROD. 



Description 

The GPROD. routine multiplies two single-precision, floating-point numbers 
and returns a double-precision, G-floating-point product. That is: 

GPROD.(x,y) = x*y 

Routines Called 

GPROD. calls the MTHERR routine. 

Type of Arguments 

Both arguments must be single-precision, floating-point values; they can be 
any such values. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it may be 
any such value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GPROD.(x,y) is calculated as follows. 

Let x = GDB.0(x) 
    y = GDB.0(y) 

GPROD.(x,y) = x*y 

Error Conditions 

None 
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MOD 



Description 

The MOD routine returns the integer remainder of the quotient of its integer 
arguments. That is: 

MOD(i,j) = i - [i/j]*j 

Routines Called 

None 

Type of Arguments 

Both arguments must be integer; the second argument cannot equal zero. If 
the first argument is negative, the result is negative. 

Type of Result 

The result returned is an integer value in the range -|j| to |j|. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

MOD(i,j) is calculated as follows. 

MOD(i,j) = (|i| - [|i|/j]*j)*sgn(i) 

[|i|/j] = the greatest integer in |i|/j 

Error Conditions 

None 
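
A minimal Python sketch of the formula follows. It assumes that the bracket denotes the integer part of the quotient (truncation toward zero), which is the reading that makes the result take the sign of the first argument for either sign of j. The name mod_sketch is hypothetical.

```python
def mod_sketch(i: int, j: int) -> int:
    """Integer remainder whose sign follows the first argument."""
    if j == 0:
        raise ZeroDivisionError("the second argument cannot equal zero")
    sign_i = -1 if i < 0 else 1
    abs_i = abs(i)
    # Integer part of |i|/j, truncated toward zero (assumption, see above).
    quotient = abs_i // abs(j)
    if j < 0:
        quotient = -quotient
    return (abs_i - quotient * j) * sign_i
```

Python's own % operator takes the sign of the second operand, so -7 % 3 is 2, whereas mod_sketch(-7, 3) is -1.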
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AMOD 



Description 

The AMOD routine returns the single-precision, floating-point remainder of 
the quotient of its single-precision, floating-point arguments. That is: 

AMOD(x,y) = x - [x/y]*y 

Routines Called 

AMOD calls the MTHERR routine. 

Type of Arguments 

Both arguments must be single-precision, floating-point values; the second 
argument cannot equal zero. If the first argument is negative, the result will 
be negative. 

Type of Result 

The result returned is a single-precision, floating-point value in the range 
-|y| to |y|. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

AMOD(x,y) is calculated as follows. 

AMOD(x,y) = (|x| - [|x|/y]*y)*sgn(x) 
[|x|/y] = largest integer in |x|/y 

Error Conditions 

Underflow may occur if y is too small a number. If underflow occurs, the 
following message is issued and the result is set to 0.0. 

AMOD: Result underflow 
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DMOD 



Description 

The DMOD routine returns the double-precision, D-floating-point remainder 
of the quotient of its double-precision, D-floating-point arguments. That is: 

DMOD(x,y) = x - [x/y]*y 

Routines Called 

DMOD calls the MTHERR routine. 

Type of Arguments 

Both arguments must be double-precision, D-floating-point values; the sec- 
ond argument cannot equal zero. If the first argument is negative, the result 
will be negative. 

Type of Result 

The result returned is a double-precision, D-floating-point value in the range 
-|y| to |y|. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DMOD(x,y) is calculated as follows. 

DMOD(x,y) = (|x| - [|x|/y]*y)*sgn(x) 
[|x|/y] = largest integer in |x|/y 

Error Conditions 

Underflow may occur if y is too small a number. If underflow occurs, the 
following message is issued and the result is set to 0.0. 

DMOD: Result underflow 
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GMOD 



Description 

The GMOD routine returns the double-precision, G-floating-point remainder 
of the quotient of its double-precision, G-floating-point arguments. That is: 

GMOD(x,y) = x - [x/y]*y 

Routines Called 

GMOD calls the MTHERR routine. 

Type of Arguments 

Both arguments must be double-precision, G-floating-point values; the sec- 
ond argument cannot equal zero. If the first argument is negative, the result 
will be negative. 

Type of Result 

The result returned is a double-precision, G-floating-point value in the range 
-|y| to |y|. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GMOD(x,y) is calculated as follows. 

GMOD(x,y) = (|x| - [|x|/y]*y)*sgn(x) 
[|x|/y] = largest integer in |x|/y 

Error Conditions 

Underflow may occur if y is too small a number. If underflow occurs, the 
following message is issued and the result is set to 0.0. 

GMOD: Result underflow 
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IDIM 



Description 

The IDIM routine returns the integer difference between its integer arguments, 
provided that the difference is positive. If the difference is negative, 
IDIM returns zero. That is: 

IDIM(i,j) = i-j 

Routines Called 

IDIM calls the MTHERR routine. 

Type of Arguments 

Both arguments must be integer values; they can be any such values. 

Type of Result 

The result returned is an integer value greater than or equal to 0. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

IDIM(i,j) is calculated as follows. 

If i < j 

    IDIM(i,j) = 0 

If i > j 

    IDIM(i,j) = i-j 

Error Conditions 

If overflow occurs during subtraction, the following message is issued and the 
result is set to machine infinity. 

IDIM: Result overflow 
DIM



Description 

The DIM routine returns the single-precision, floating-point difference be- 
tween its single-precision, floating-point arguments, provided that the differ- 
ence is positive. If the difference is negative, DIM returns zero. That is: 

DIM(x,y) = x-y 

Routines Called 

DIM calls the MTHERR routine. 

Type of Arguments 

Both arguments must be single-precision, floating-point values; they can be 
any such values. 

Type of Result 

The result returned is a single-precision, floating-point value greater than or 
equal to 0.0. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

DIM(x,y) is calculated as follows. 



If x<y

DIM(x,y) = 0.0

If x>y

DIM(x,y) = x-y


Error Conditions 





1. If overflow occurs during subtraction, the following message is issued and 
the result is set to machine infinity. 

DIM: Result overflow 

2. If underflow occurs during subtraction, the following message is issued 
and the result is set to 0.0. 

DIM: Result underflow 






DDIM 



Description 

The DDIM routine returns the double-precision, D-floating-point difference 
between its double-precision, D-floating-point arguments, provided that the 
difference is positive. If the difference is negative, DDIM returns zero. That is: 

DDIM(x,y) = x-y 

Routines Called 

DDIM calls the MTHERR routine. 

Type of Arguments 

Both arguments must be double-precision, D-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a double-precision, D-floating-point value greater than 
or equal to 0.0. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

DDIM(x,y) is calculated as follows. 

If x<y 

DDIM(x,y) = 0.0 

If x>y

DDIM(x,y) = x-y 

Error Conditions 

1. If overflow occurs during subtraction, the following message is issued and 
the result is set to machine infinity. 

DDIM: Result overflow 

2. If underflow occurs during subtraction, the following message is issued 
and the result is set to 0.0. 

DDIM: Result underflow 
GDIM



Description 

The GDIM routine returns the double-precision, G-floating-point difference 
between its double-precision, G-floating-point arguments, provided that the 
difference is positive. If the difference is negative, GDIM returns zero. That is: 

GDIM(x,y) = x-y 

Routines Called 

GDIM calls the MTHERR routine. 

Type of Arguments 

Both arguments must be double-precision, G-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a double-precision, G-floating-point value greater than 
or equal to 0.0. 

Accuracy of Result 

The result is rounded with an error bound of half a least significant bit. 

Algorithm Used 

GDIM(x,y) is calculated as follows. 

If x<y

GDIM(x,y) = 0.0 

If x>y 

GDIM(x,y) = x-y 

Error Conditions 

1. If overflow occurs during subtraction, the following message is issued and 
the result is set to machine infinity. 

GDIM: Result overflow 

2. If underflow occurs during subtraction, the following message is issued 
and the result is set to 0.0. 

GDIM: Result underflow 
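
The positive-difference rule shared by IDIM, DIM, DDIM, and GDIM can be sketched in a few lines of Python. The name dim_sketch is hypothetical, Python floats stand in for the floating-point formats, and the overflow and underflow handling described above is not modeled.

```python
def dim_sketch(x, y):
    """Positive difference: x - y when x exceeds y, otherwise zero."""
    if x > y:
        return x - y
    return 0.0


print(dim_sketch(5.0, 3.0))
print(dim_sketch(3.0, 5.0))
```

Equal arguments fall into the second branch here, which returns the same value (zero) that the subtraction would give.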
Chapter 13

Transfer of Sign Routines 



ISIGN 



Description 

The ISIGN routine transfers the sign of its integer second argument to its 
integer first argument, ignoring the sign of the first argument. That is: 

ISIGN(i,j) = |i|*sgn(j)

Routines Called 

ISIGN calls the MTHERR routine. 

Type of Arguments 

Both arguments must be integer values; they can be any such values. 

Type of Result 

The result returned is an integer value; it has the same magnitude as the first 
argument. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

ISIGN(i,j) is calculated as follows.

ISIGN(i,j) = |i|*sgn(j)

If j>0

ISIGN(i,j) = |i|

If j<0

ISIGN(i,j) = -|i|

Error Conditions 

If i = -2^35 and j > 0, overflow occurs. If overflow occurs, the following message
is issued and the result is set to machine infinity. 

ISIGN: Result overflow 
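
The overflow case arises because a 36-bit two's-complement word can hold -2^35 but not +2^35. The Python sketch below illustrates this under stated assumptions: the names WORD_MAX, WORD_MIN, and isign_sketch are hypothetical, integer "machine infinity" is assumed to be the largest positive 36-bit integer, and a zero second argument is treated as positive, following the usual FORTRAN convention.

```python
WORD_MAX = 2**35 - 1   # assumed integer "machine infinity" for a 36-bit word
WORD_MIN = -2**35      # most negative 36-bit two's-complement integer


def isign_sketch(i, j):
    """Return (result, overflowed) for |i| carrying the sign of j."""
    magnitude = abs(i)
    # Assumption: j == 0 is treated like a positive j.
    result = magnitude if j >= 0 else -magnitude
    if result > WORD_MAX:
        # |i| does not fit in the word; substitute machine infinity.
        return WORD_MAX, True
    return result, False


print(isign_sketch(5, -2))
print(isign_sketch(WORD_MIN, 1))
```

Only the combination i = -2^35 with a nonnegative j sets the overflow flag; every other pair of in-range arguments yields a representable result.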
SIGN



Description

The SIGN routine transfers the sign of its single-precision, floating-point
second argument to its single-precision, floating-point first argument,
ignoring the sign of the first argument. That is:

SIGN(x,y) = |x|*sgn(y)

Routines Called 

None 

Type of Arguments 

Both arguments must be single-precision, floating-point values; they can be 
any such values. 

Type of Result 

The result returned is a single-precision, floating-point value; it has the same 
magnitude as the first argument. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

SIGN(x,y) is calculated as follows. 

SIGN(x,y) = |x|*sgn(y)

If y > 0.0

SIGN(x,y) = |x|

If y < 0.0

SIGN(x,y) = -|x|

Error Conditions 

None 
DSIGN



Description

The DSIGN routine transfers the sign of its double-precision, D-floating-point
second argument to its double-precision, D-floating-point first argument,
ignoring the sign of the first argument. That is:

DSIGN(x,y) = |x|*sgn(y)

Routines Called 

None 

Type of Arguments 

Both arguments must be double-precision, D-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it has the 
same magnitude as the first argument. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DSIGN(x,y) is calculated as follows. 

DSIGN(x,y) = |x|*sgn(y)

If y > 0.0

DSIGN(x,y) = |x|

If y < 0.0

DSIGN(x,y) = -|x|

Error Conditions 

None 
GSIGN



Description

The GSIGN routine transfers the sign of its double-precision, G-floating-point
second argument to its double-precision, G-floating-point first argument,
ignoring the sign of the first argument. That is:

GSIGN(x,y) = |x|*sgn(y)

Routines Called 

None 

Type of Arguments 

Both arguments must be double-precision, G-floating-point values; they can 
be any such values. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it has the 
same magnitude as the first argument. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GSIGN(x,y) is calculated as follows. 

GSIGN(x,y) = |x|*sgn(y)

If y > 0.0

GSIGN(x,y) = |x|

If y < 0.0

GSIGN(x,y) = -|x|

Error Conditions 

None 
Chapter 14
Maximum/Minimum Routines 



MAX0



Description

The MAX0 routine finds the integer maximum of a series of integer arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be integer values; they can be any such values. There 
can be as many arguments as desired. 

Type of Result 

The result returned is an integer value; it is the largest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

MAX0(i,...j) is calculated as follows. 

The MAX0 routine compares each argument in succession with the current
largest argument, which is held in a register. Each time an argument exceeds 
the current largest argument, the register is updated. This loop continues 
until the final argument is processed. The contents of the register are then 
returned as the result. 

Error Conditions 

None 
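
The comparison loop described above can be sketched in Python as follows. The name max0_sketch is hypothetical, and a local variable plays the role of the register that holds the current largest argument.

```python
def max0_sketch(first, *rest):
    """Return the largest of one or more integer arguments."""
    largest = first              # stands in for the register
    for value in rest:
        if value > largest:      # argument exceeds the current largest
            largest = value      # update the "register"
    return largest


print(max0_sketch(3, 9, -4, 7))
```

Reversing the comparison (value < smallest) gives the corresponding minimum loop.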
MAX1



Description 

The MAX1 routine finds the integer maximum of a series of single-precision, 
floating-point arguments. 

Routines Called 

None 

Type of Arguments 

All the arguments must be single-precision, floating-point values; they can be 
any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is the largest value in the series converted to integer 
format. 

Accuracy of Result 

The result is exact except for possible overflow during the conversion to inte- 
ger. 

Algorithm Used 

MAX1(x,...y) is calculated as follows.

The MAX1 routine compares each argument in succession with the current 
largest argument, which is held in a register. Each time an argument exceeds 
the current largest argument, the register is updated. This loop continues 
until the final argument is processed. The contents of the register are then 
converted to integer format and returned as the result. 

Error Conditions 

Overflow can occur during conversion to integer. If overflow occurs, the result 
is set to ± machine infinity. 
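
A minimal Python sketch of the final conversion step follows. It assumes that conversion to integer truncates toward zero and that integer machine infinity corresponds to the limits of a 36-bit two's-complement word; neither detail is stated in this section, and the names WORD_MAX, WORD_MIN, and max1_sketch are hypothetical.

```python
WORD_MAX = 2**35 - 1   # assumed +machine infinity for integers
WORD_MIN = -2**35      # assumed -machine infinity for integers


def max1_sketch(first, *rest):
    """Largest floating-point argument, converted to a 36-bit integer."""
    largest = first
    for value in rest:
        if value > largest:
            largest = value
    as_integer = int(largest)    # assumption: truncation toward zero
    # Clamp to the word limits to mimic the overflow result.
    return max(WORD_MIN, min(WORD_MAX, as_integer))


print(max1_sketch(2.7, -1.0, 9.9))
print(max1_sketch(1.0e20, 0.0))
```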
AMAX0



Description

The AMAX0 routine finds the single-precision, floating-point maximum of a
series of integer arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be integer; they can be any such values. There can be 
as many arguments as desired. 

Type of Result 

The result returned is the largest value in the series converted to
single-precision, floating-point format.

Accuracy of Result 

The result is exact unless a rounding error occurs during conversion, in which 
case the error could be half a least significant bit. 

Algorithm Used 

AMAX0(i,...j) is calculated as follows. 

The AMAX0 routine compares each argument in succession with the current
largest argument, which is held in a register. Each time an argument exceeds 
the current largest argument, the register is updated. This loop continues 
until the final argument is processed. The contents of the register are then 
converted to single-precision, floating-point format and returned as the result. 

Error Conditions 

None 
AMAX1



Description 

The AMAX1 routine finds the single-precision, floating-point maximum of a 
series of single-precision, floating-point arguments. 

Routines Called 

None 

Type of Arguments 

All the arguments must be single-precision, floating-point values; they can be 
any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a single-precision, floating-point value; it is the largest 
value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

AMAX1(x,...y) is calculated as follows.

The AMAX1 routine compares each argument in succession with the current 
largest argument, which is held in a register. Each time an argument exceeds 
the current largest argument, the register is updated. This loop continues 
until the final argument is processed. The contents of the register are then 
returned as the result. 

Error Conditions 

None 
DMAX1



Description 

The DMAX1 routine finds the double-precision, D-floating-point maximum 
of a series of double-precision, D-floating-point arguments. 

Routines Called 

None 

Type of Arguments 

All the arguments must be double-precision, D-floating-point values; they can 
be any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it is the 
largest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DMAX1(x,...y) is calculated as follows.

The DMAX1 routine compares each argument in succession with the current 
largest argument, which is held in two registers. Each time an argument 
exceeds the current largest argument, the registers are updated. This loop 
continues until the final argument is processed. The contents of the registers 
are then returned as the result. 

Error Conditions 

None 
GMAX1



Description

The GMAX1 routine finds the double-precision, G-floating-point maximum
of a series of double-precision, G-floating-point arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be double-precision, G-floating-point values; they can 
be any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it is the 
largest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GMAX1(x,...y) is calculated as follows.

The GMAX1 routine compares each argument in succession with the current 
largest argument, which is held in two registers. Each time an argument 
exceeds the current largest argument, the registers are updated. This loop 
continues until the final argument is processed. The contents of the registers 
are then returned as the result. 

Error Conditions 

None 
MIN0



Description

The MIN0 routine finds the integer minimum of a series of integer arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be integer values; they can be any such values. There 
can be as many arguments as desired. 

Type of Result 

The result returned is an integer value; it is the smallest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

MIN0(i,...j) is calculated as follows. 

The MIN0 routine compares each argument in succession to the current
smallest argument, which is held in a register. Each time an argument is less 
than the current smallest argument, the register is updated. This loop contin- 
ues until the final argument is processed. The contents of the register are then 
returned as the result. 

Error Conditions 

None 
MIN1



Description

The MIN1 routine finds the integer minimum of a series of single-precision,
floating-point arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be single-precision, floating-point values; they can be 
any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is the smallest value in the series converted to integer 
format. 

Accuracy of Result 

The result is exact except for possible overflow during the conversion to inte- 
ger. 

Algorithm Used 

MIN1(x,...y) is calculated as follows.

The MIN1 routine compares each argument in succession with the current
smallest argument, which is held in a register. Each time an argument is 
smaller than the current smallest argument, the register is updated. This loop 
continues until the final argument is processed. The contents of the register 
are then converted to integer and returned as the result. 

Error Conditions 

Overflow can occur during conversion to integer. If overflow occurs, the result 
is set to ± machine infinity. 
AMIN0



Description

The AMIN0 routine finds the single-precision, floating-point minimum of a
series of integer arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be integer; they can be any such values. There can be 
as many arguments as desired. 

Type of Result 

The result returned is the smallest value in the series converted to single- 
precision, floating-point format. 

Accuracy of Result 

The result is exact unless a rounding error occurs during conversion, in which 
case the error could be half a least significant bit. 

Algorithm Used 

AMIN0(i,...j) is calculated as follows.

The AMIN0 routine compares each argument in succession with the current
smallest argument, which is held in a register. Each time an argument is 
smaller than the current smallest argument, the register is updated. This loop 
continues until the final argument is processed. The contents of the register 
are then converted to single-precision, floating-point format and returned as 
the result. 

Error Conditions 

None 
AMIN1



Description

The AMIN1 routine finds the single-precision, floating-point minimum of a
series of single-precision, floating-point arguments.

Routines Called 

None 

Type of Arguments 

All the arguments must be single-precision, floating-point values; they can be 
any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a single-precision, floating-point value; it is the small- 
est value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

AMIN1(x,...y) is calculated as follows.

The AMIN1 routine compares each argument in succession with the current
smallest argument, which is held in a register. Each time an argument is 
smaller than the current smallest argument, the register is updated. This loop 
continues until the final argument is processed. The contents of the register 
are then returned as the result. 

Error Conditions 

None 
DMIN1



Description 

The DMIN1 routine finds the double-precision, D-floating-point minimum of 
a series of double-precision, D-floating-point arguments. 

Routines Called 

None 

Type of Arguments 

All the arguments must be double-precision, D-floating-point values; they can 
be any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a double-precision, D-floating-point value; it is the 
smallest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

DMIN1(x,...y) is calculated as follows.

The DMIN1 routine compares each argument in succession with the current 
smallest argument, which is held in two registers. Each time an argument is 
less than the current smallest argument, the registers are updated. This loop 
continues until the final argument is processed. The contents of the registers 
are then returned as the result. 

Error Conditions 

None 
GMIN1



Description 

The GMIN1 routine finds the double-precision, G-floating-point minimum of 
a series of double-precision, G-floating-point arguments. 

Routines Called 

None 

Type of Arguments 

All the arguments must be double-precision, G-floating-point values; they can 
be any such values. There can be as many arguments as desired. 

Type of Result 

The result returned is a double-precision, G-floating-point value; it is the 
smallest value in the series. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

GMIN1(x,...y) is calculated as follows.

The GMIN1 routine compares each argument in succession with the current 
smallest argument, which is held in two registers. Each time an argument is 
less than the current smallest argument, the registers are updated. This loop 
continues until the final argument is processed. The contents of the registers 
are then returned as the result. 

Error Conditions 

None 
Chapter 15

Miscellaneous Complex Routines 



REAL.C 



Description 

The REAL.C routine returns the real part of a complex number. That is: 

REAL.C(z) = REAL.C(x+i*y) = x

Routines Called 

None 

Type of Argument 

The argument must be a complex value; it can be any such value. 

Type of Result 

The result returned is a single-precision, floating-point value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

REAL.C(z) is calculated by copying the real part of the argument to the 
return location. 

Error Conditions 

None 
AIMAG



Description

The AIMAG routine returns the imaginary part of a complex number. That is:

AIMAG(z) = AIMAG(x+i*y) = y

Routines Called 

None 

Type of Argument 

The argument must be a complex value; it can be any such value. 

Type of Result 

The result returned is a single-precision, floating-point value; it is the imagi- 
nary part of the number. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

AIMAG(z) is calculated by copying the imaginary part of the argument to the 
return location. 

Error Conditions 

None 
CONJ



Description

The CONJ routine finds the conjugate of a complex number. That is:

CONJ(z) = conj(x+i*y) = x-i*y

Routines Called 

None 

Type of Argument 

The argument must be a complex value; it can be any such value. 

Type of Result 

The result returned is a complex value; it is the conjugate of the argument 
value. 

Accuracy of Result 

The result is exact. 

Algorithm Used 

CONJ(z) is calculated as follows. 

Let z = x+i*y

conj(x+i*y) = x+(-i*y)
CONJ(z) = x-i*y

Error Conditions 

None 






CFM 



Description 

The CFM subroutine finds the complex, single-precision, floating-point prod- 
uct of two complex, single-precision, floating-point values. That is: 

CFM(z,g) = z*g

Routines Called 

CFM calls the MTHERR routine. 

Type of Arguments 

CFM is a subroutine with two arguments; both must be complex, single- 
precision, floating-point values. They can be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value. 



Accuracy of Result 



test interval: -10000. through 10000. for z (real)
               -10000. through 10000. for z (imaginary)
               -10000. through 10000. for g (real)
               -10000. through 10000. for g (imaginary)

MRE: 1.20x10^-6 (16.4 bits) real
     1.47x10^-6 (19.4 bits) imaginary

RMS: 2.64x10^-7 (21.9 bits) real
     5.81x10^-8 (24.0 bits) imaginary

LSB error distribution:  -4+   -3   -2   -1    0   +1   +2   +3  +4+
                          2%   1%   1%  14%  64%  15%   1%   1%   2%  real
                          1%   1%   1%  15%  64%  14%   1%   1%   2%  imaginary

Algorithm Used 

CFM(z,g) is calculated as follows. 

Let z = a+i*b
Let g = c+i*d

If CFM(z,g) = (a+i*b)*(c+i*d)

CFM(z,g) = (a*c-b*d)+i*(b*c+a*d)

Error Conditions 

1. If either part of the result overflows, the following message is issued and 
that part of the result is set to machine infinity. 

CMATH: Complex overflow 

2. If either part of the result underflows, the following message is issued and 
that part of the result is set to 0.0. 

CMATH: Complex underflow 
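
The product formula above can be checked with a short Python sketch. The name cfm_sketch is hypothetical, complex values are represented as (real, imaginary) tuples of Python floats, and the overflow and underflow handling of the library routine is not modeled.

```python
def cfm_sketch(z, g):
    """Complex product of z = (a, b) and g = (c, d) as a (real, imag) tuple."""
    a, b = z
    c, d = g
    real_part = a * c - b * d
    imaginary_part = b * c + a * d
    return (real_part, imaginary_part)


# (1 + 2i) * (3 + 4i)
print(cfm_sketch((1.0, 2.0), (3.0, 4.0)))
```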
CFDV



Description 

The CFDV subroutine finds the complex, single-precision, floating-point quo- 
tient of two complex, single-precision, floating-point values. That is: 

CFDV(z,g) = z/g 

Routines Called 

CFDV calls the MTHERR routine. 

Type of Arguments 

CFDV is a subroutine with two arguments; both must be complex, single- 
precision, floating-point values. They can be any such values. 

Type of Result 

The result returned is a complex, single-precision, floating-point value; it may 
be any such value. 

Accuracy of Result 

test interval: -10000. through 10000. for z (real)
               -10000. through 10000. for z (imaginary)
               -10000. through 10000. for g (real)
               -10000. through 10000. for g (imaginary)

MRE: 2.87x10^-7 (21.7 bits) real
     7.60x10^-7 (20.3 bits) imaginary

RMS: 1.33x10^-8 (26.2 bits) real
     2.30x10^-8 (25.4 bits) imaginary

LSB error distribution:  -4+   -3   -2   -1    0   +1   +2   +3  +4+
                          1%   1%   3%  22%  49%  21%   2%   0%   1%  real
                          1%   1%   3%  21%  50%  20%   3%   1%   1%  imaginary

Algorithm Used 

CFDV(z,g) is calculated as follows. 

Let z = a+i*b
Let g = c+i*d

If CFDV(z,g) = (a+i*b)/(c+i*d)

CFDV(z,g) = ((a*c+b*d)+i*(b*c-a*d))/(c^2+d^2)

Error Conditions 

1. If either part of the result underflows, the following message is issued and 
that part of the result is set to 0.0. 

CMATH: Complex underflow 

2. If either part of the result overflows, that part of the result is set to 
machine infinity. 
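
The quotient formula above can be sketched in Python in the same way. The name cfdv_sketch is hypothetical and complex values are (real, imaginary) tuples of Python floats. The sketch applies the formula literally, so the intermediate sum c^2+d^2 can overflow or underflow for extreme divisors; how the library routine guards against that is not described in this section.

```python
def cfdv_sketch(z, g):
    """Complex quotient z/g for z = (a, b), g = (c, d) as a (real, imag) tuple."""
    a, b = z
    c, d = g
    denominator = c * c + d * d
    if denominator == 0.0:
        raise ZeroDivisionError("the divisor must be nonzero")
    real_part = (a * c + b * d) / denominator
    imaginary_part = (b * c - a * d) / denominator
    return (real_part, imaginary_part)


# (-5 + 10i) / (3 + 4i)
print(cfdv_sketch((-5.0, 10.0), (3.0, 4.0)))
```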






Appendix A 
ELEFUNT Test Results 



This appendix contains the results of the ELEFUNT tests of W. J. Cody, 
Argonne National Laboratory. For each test, the test interval, maximum rela- 
tive error (MRE), and root mean square (RMS) relative error are given. Note 
that it is not meaningful to compare these test results with the test results 
given for each routine under the heading "Accuracy of Result." 
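
The bit counts printed beside each MRE and RMS value appear to equal the negative base-2 logarithm of the error; this is an observation from the figures themselves, not a definition given in this appendix. The Python sketch below shows that conversion together with one common way of computing the two statistics from sampled values. The names error_bits and mre_and_rms are hypothetical.

```python
import math


def error_bits(relative_error):
    """Express a relative error as a number of correct bits."""
    return -math.log2(relative_error)


def mre_and_rms(computed, reference):
    """Maximum and root-mean-square relative error over paired samples."""
    errors = [abs((c - r) / r) for c, r in zip(computed, reference)]
    mre = max(errors)
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mre, rms


print(round(error_bits(0.1231e-7), 1))
print(round(error_bits(0.2868e-8), 1))
```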

ACOS(x) vs Taylor Series

test interval: -1.0000 through -0.7500
MRE: 0.1231x10^-7 (26.3 bits)
RMS: 0.2868x10^-8 (28.4 bits)

ACOS(x) vs Taylor Series

test interval: 0.7500 through 1.0000
MRE: 0.1488x10^-7 (26.0 bits)
RMS: 0.1330x10^-8 (29.5 bits)

ACOS(x) vs Taylor Series

test interval: -0.1250 through 0.1250
MRE: 0.1030x10^-7 (26.5 bits)
RMS: 0.2647x10^-8 (28.5 bits)

ALOG(x*x) vs 2*log_e(x)

test interval: 0.1600x10^2 through 0.2400x10^3
MRE: 0.1466x10^-7 (26.0 bits)
RMS: 0.2292x10^-8 (28.7 bits)

ALOG(x) vs Taylor Series expansion of ALOG(1+y)

test interval: 1-0.1953x10^-2 through 1+0.1953x10^-2
MRE: 0.2466x10^-7 (25.3 bits)
RMS: 0.6614x10^-8 (27.2 bits)

ALOG(x) vs ALOG(17x/16)-ALOG(17/16)

test interval: 0.7071 through 0.9375
MRE: 0.2264x10^-7 (25.4 bits)
RMS: 0.6426x10^-8 (27.2 bits)
ALOG10(x) vs ALOG10(11x/10)-ALOG10(11/10)

test interval: 0.3162 through 0.9000
MRE: 0.3863x10^-7 (24.6 bits)
RMS: 0.1122x10^-7 (26.4 bits)

ASIN(x) vs Taylor Series

test interval: 0.7500 through 1.0000
MRE: 0.1478x10^-7 (26.0 bits)
RMS: 0.3245x10^-8 (28.2 bits)

ASIN(x) vs Taylor Series

test interval: -0.1250 through 0.1250
MRE: 0.1190x10^-7 (26.3 bits)
RMS: 0.6733x10^-9 (30.5 bits)

ATAN(x) vs truncated Taylor Series

test interval: -0.6250x10^-1 through 0.6250x10^-1
MRE: 0.8032x10^-8 (26.9 bits)
RMS: 0.1796x10^-9 (32.4 bits)

ATAN(x) vs ATAN(1/16)+ATAN((x-1/16)/(1+x/16))

test interval: 0.6250x10^-1 through 0.2679
MRE: 0.1488x10^-7 (26.0 bits)
RMS: 0.6219x10^-8 (27.3 bits)

2*ATAN(x) vs ATAN(2x/(1-x*x))

test interval: 0.2679 through 0.4142
MRE: 0.1423x10^-7 (26.1 bits)
RMS: 0.6597x10^-8 (27.2 bits)

2*ATAN(x) vs ATAN(2x/(1-x*x))

test interval: 0.4142 through 1.0000
MRE: 0.1484x10^-7 (26.0 bits)
RMS: 0.3894x10^-8 (27.9 bits)

COS(x) vs 4*COS(x/3)^3-3*COS(x/3)

test interval: 0.2199x10^2 through 0.2356x10^2
MRE: 0.2070x10^-7 (25.5 bits)
RMS: 0.6463x10^-8 (27.2 bits)

COSH(x) vs C*(COSH(x+1)+COSH(x-1))

test interval: 3.0000 through 0.8803x10^2
MRE: 0.2219x10^-7 (25.4 bits)
RMS: 0.7007x10^-8 (27.1 bits)

COSH(x) vs Taylor Series expansion of COSH(x)

test interval: 0.0000 through 0.5000
MRE: 0.1490x10^-7 (26.0 bits)
RMS: 0.5491x10^-8 (27.4 bits)

COT(x) vs (COT(x/2)^2-1)/(2*COT(x/2))

test interval: 0.1885x10^2 through 0.1963x10^2
MRE: 0.2975x10^-7 (25.0 bits)
RMS: 0.8629x10^-8 (26.8 bits)
DACOS(x) vs Taylor Series

test interval: -1.0000 through -0.7500
MRE: 0.3582x10^-18 (61.3 bits)
RMS: 0.1211x10^-18 (62.8 bits)

DACOS(x) vs Taylor Series

test interval: -0.1250 through 0.1250
MRE: 0.3000x10^-18 (61.5 bits)
RMS: 0.1224x10^-18 (62.8 bits)

DACOS(x) vs Taylor Series

test interval: 0.7500 through 1.0000
MRE: 0.4337x10^-18 (61.0 bits)
RMS: 0.1682x10^-18 (62.4 bits)

DASIN(x) vs Taylor Series

test interval: -0.1250 through 0.1250
MRE: 0.4334x10^-18 (61.0 bits)
RMS: 0.1715x10^-18 (62.3 bits)

DASIN(x) vs Taylor Series

test interval: 0.7500 through 1.0000
MRE: 0.4326x10^-18 (61.0 bits)
RMS: 0.1168x10^-18 (62.9 bits)

DATAN(x) vs truncated Taylor Series

test interval: -0.6250x10^-1 through 0.6250x10^-1
MRE: 0.4326x10^-18 (61.0 bits)
RMS: 0.1370x10^-18 (62.7 bits)

DATAN(x) vs DATAN(1/16)+DATAN((x-1/16)/(1+x/16))

test interval: 0.6250x10^-1 through 0.2679
MRE: 0.4333x10^-18 (61.0 bits)
RMS: 0.1755x10^-18 (62.3 bits)

2*DATAN(x) vs DATAN(2x/(1-x*x))

test interval: 0.2679 through 0.4142
MRE: 0.6610x10^-18 (60.4 bits)
RMS: 0.1987x10^-18 (62.1 bits)

2*DATAN(x) vs DATAN(2x/(1-x*x))

test interval: 0.4142 through 1.0000
MRE: 0.4319x10^-18 (61.0 bits)
RMS: 0.1167x10^-18 (62.9 bits)

DCOS(x) vs 4*DCOS(x/3)^3-3*DCOS(x/3)

test interval: 0.2199x10^2 through 0.2356x10^2
MRE: 0.6523x10^-18 (60.4 bits)
RMS: 0.1960x10^-18 (62.2 bits)

DCOSH(x) vs Taylor Series expansion of DCOSH(x)

test interval: 0.0000 through 0.5000
MRE: 0.4337x10^-18 (61.0 bits)
RMS: 0.1550x10^-18 (62.5 bits)
DCOSH(x) vs C*(DCOSH(x+1)+DCOSH(x-1))

test interval: 3.0000 through 0.8803x10^2
MRE: 0.8440x10^-18 (60.0 bits)
RMS: 0.2805x10^-18 (61.6 bits)

DCOT(x) vs (DCOT(x/2)^2-1)/(2*DCOT(x/2))

test interval: 0.1885x10^2 through 0.1963x10^2
MRE: 0.9064x10^-18 (59.9 bits)
RMS: 0.2632x10^-18 (61.7 bits)

DEXP(x-0.0625) vs DEXP(x)/DEXP(0.0625)

test interval: -0.2841 through 0.3466
MRE: 0.4336x10^-18 (61.0 bits)
RMS: 0.1689x10^-18 (62.4 bits)

DEXP(x-2.8125) vs DEXP(x)/DEXP(2.8125)

test interval: -3.4660 through -0.4505x10^2
MRE: 0.6394x10^-18 (60.4 bits)
RMS: 0.1670x10^-18 (62.4 bits)

DEXP(x-2.8125) vs DEXP(x)/DEXP(2.8125)

test interval: -6.9310 through 0.8792x10^2
MRE: 0.6350x10^-18 (60.4 bits)
RMS: 0.1808x10^-18 (62.3 bits)

DEXP3. (x^1.0 vs x)

test interval: 0.5000 through 1.0000
The result is exact.

DEXP3. (XSQ^1.5 vs XSQ*x)

test interval: 0.5000 through 1.0000
MRE: 0.4336x10^-18 (61.0 bits)
RMS: 0.1585x10^-18 (62.4 bits)

DEXP3. (XSQ^1.5 vs XSQ*x)

test interval: 1.0000 through 0.5541x10^13
MRE: 0.4330x10^-18 (61.0 bits)
RMS: 0.1678x10^-18 (62.4 bits)

DEXP3. (x^y vs XSQ^(y/2))

test interval: 0.1000x10^-1 through 0.1000x10^2 for x
               -0.1942x10^2 through 0.1942x10^2 for y
MRE: 0.5499x10^-18 (60.7 bits)
RMS: 0.1196x10^-18 (62.9 bits)

DLOG(x) vs Taylor Series expansion of DLOG(1+y)

test interval: 1-0.9537x10^-6 through 1+0.9537x10^-6
MRE: 0.5605x10^-18 (60.6 bits)
RMS: 0.1922x10^-18 (62.2 bits)

DLOG(x) vs DLOG(17x/16)-DLOG(17/16)

test interval: 0.7071 through 0.9375
MRE: 0.9228x10^-18 (59.9 bits)
RMS: 0.3347x10^-18 (61.4 bits)
DLOG(x*x) vs 2*DLOG(x)

test interval: 0.1600x10^2 through 0.2400x10^3
MRE: 0.4306x10^-18 (61.0 bits)
RMS: 0.7895x10^-19 (63.5 bits)

DLOG10(x) vs DLOG10(11x/10)-DLOG10(11/10)

test interval: 0.3162 through 0.9000
MRE: 0.1476x10^-17 (59.2 bits)
RMS: 0.3747x10^-18 (61.2 bits)

DSIN(x) vs 3*DSIN(x/3)-4*DSIN(x/3)^3

test interval: 0.0000 through 1.5710
MRE: 0.5378x10^-18 (60.7 bits)
RMS: 0.1802x10^-18 (62.3 bits)

DSIN(x) vs 3*DSIN(x/3)-4*DSIN(x/3)^3

test interval: 0.1885x10^2 through 0.2042x10^2
MRE: 0.6115x10^-18 (60.5 bits)
RMS: 0.1960x10^-18 (62.2 bits)

DSINH(x) vs Taylor Series expansion of DSINH(x)

test interval: 0.0000 through 0.5000
MRE: 0.4336x10^-18 (61.0 bits)
RMS: 0.8776x10^-19 (63.3 bits)

DSINH(x) vs C*(DSINH(x+1)+DSINH(x-1))

test interval: 3.0000 through 0.8803x10^2
MRE: 0.8643x10^-18 (60.0 bits)
RMS: 0.2736x10^-18 (61.7 bits)

DSQRT(x*x)-x

test interval: 0.7071 through 1.0000
MRE: 0.3064x10^-18 (61.5 bits)
RMS: 0.7383x10^-19 (63.6 bits)

DSQRT(x*x)-x

test interval: 1.0000 through 1.4140
The result is exact.

DTAN(x) vs 2*DTAN(x/2)/(1-DTAN(x/2)^2)

test interval: 0.1885x10^2 through 0.1963x10^2
MRE: 0.1262x10^-17 (59.5 bits)
RMS: 0.3402x10^-18 (61.4 bits)

DTAN(x) vs 2*DTAN(x/2)/(1-DTAN(x/2)^2)

test interval: 2.7490 through 3.5340
MRE: 0.1216x10^-17 (59.5 bits)
RMS: 0.2492x10^-18 (61.8 bits)

DTAN(x) vs 2*DTAN(x/2)/(1-DTAN(x/2)^2)

test interval: 0.0000 through 0.7854
MRE: 0.1094x10^-17 (59.7 bits)
RMS: 0.3331x10^-18 (61.4 bits)
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DTANH(x) vs (DTANH(x-l/8)+DTANH(l/8))/(l+DTANH(x-l/8)DTANH(l/8)) 
test interval: 0.1250 through 0.5493 

MRE: 0.8436xl0 18 (60.0 bits) 
RMS: 0.2150x10 18 (62.0 bits) 

DTANH(x) vs (DTANH(x-l/8)+DTANH(l/8))/(l+DTANH(x-l/8)DTANH(l/8)) 
test interval: 0.6743 through 0.2253xl0 2 
MRE: 0.4952x10 l8 (60.8 bits) 
RMS: 0.1966X10 18 (62.1 bits) 

EXP(x-0.0625) vs EXP(x)/EXP(0.0625)
test interval: -0.2841 through 0.3466
MRE: 0.1489×10^-7 (26.0 bits)
RMS: 0.5801×10^-8 (27.4 bits)

EXP(x-2.8125) vs EXP(x)/EXP(2.8125)
test interval: -3.4660 through -0.6931×10^2
MRE: 0.1489×10^-7 (26.0 bits)
RMS: 0.5879×10^-8 (27.3 bits)

EXP(x-2.8125) vs EXP(x)/EXP(2.8125)
test interval: 6.9310 through 0.8792×10^2
MRE: 0.2108×10^-7 (25.5 bits)
RMS: 0.5768×10^-8 (27.4 bits)

EXP3. (x^1.0 vs x)
test interval: 0.5000 through 1.0000
The result is exact.

EXP3. (XSQ^1.5 vs XSQ·x)
test interval: 0.5000 through 1.0000
MRE: 0.1487×10^-7 (26.0 bits)
RMS: 0.5433×10^-8 (27.5 bits)

EXP3. (XSQ^1.5 vs XSQ·x)
test interval: 1.0000 through 0.5541×10^13
MRE: 0.1461×10^-7 (26.0 bits)
RMS: 0.5347×10^-8 (27.5 bits)

EXP3. (x^y vs XSQ^(y/2))
test interval: 0.1000×10^1 through 0.1000×10^2 for x
               -0.1942×10^2 through 0.1942×10^2 for y
MRE: 0.2065×10^-7 (25.5 bits)
RMS: 0.3572×10^-8 (28.0 bits)

GACOS(x) vs Taylor Series
test interval: -1.0000 through -0.7500
MRE: 0.2869×10^-17 (58.3 bits)
RMS: 0.1515×10^-17 (59.2 bits)

GACOS(x) vs Taylor Series
test interval: 0.7500 through 1.0000
MRE: 0.3443×10^-17 (58.0 bits)
RMS: 0.4924×10^-18 (60.8 bits)

GACOS(x) vs Taylor Series
test interval: -0.1250 through 0.1250
MRE: 0.2399×10^-17 (58.5 bits)
RMS: 0.1297×10^-17 (59.4 bits)

GASIN(x) vs Taylor Series
test interval: 0.7500 through 1.0000
MRE: 0.3457×10^-17 (58.0 bits)
RMS: 0.1452×10^-17 (59.3 bits)

GASIN(x) vs Taylor Series
test interval: -0.1250 through 0.1250
MRE: 0.3462×10^-17 (58.0 bits)
RMS: 0.4997×10^-18 (60.8 bits)

GATAN(x) vs truncated Taylor Series
test interval: -0.6250×10^-1 through 0.6250×10^-1
MRE: 0.3389×10^-17 (58.0 bits)
RMS: 0.3674×10^-18 (61.2 bits)

GATAN(x) vs GATAN(1/16) + GATAN((x-1/16)/(1+x/16))
test interval: 0.6250×10^-1 through 0.2679
MRE: 0.3899×10^-17 (57.8 bits)
RMS: 0.1436×10^-17 (59.3 bits)

2·GATAN(x) vs GATAN(2x/(1-x·x))
test interval: 0.2679 through 0.4142
MRE: 0.3308×10^-17 (58.1 bits)
RMS: 0.1601×10^-17 (59.1 bits)

2·GATAN(x) vs GATAN(2x/(1-x·x))
test interval: 0.4142 through 1.0000
MRE: 0.4360×10^-17 (57.7 bits)
RMS: 0.9839×10^-18 (59.8 bits)

GCOS(x) vs 4·GCOS(x/3)^3 - 3·GCOS(x/3)
test interval: 0.2199×10^2 through 0.2356×10^2
MRE: 0.4779×10^-17 (57.5 bits)
RMS: 0.1515×10^-17 (59.2 bits)

GCOSH(x) vs C·(GCOSH(x+1) + GCOSH(x-1))
test interval: 3.0000 through 0.7091×10^3
MRE: 0.4770×10^-17 (57.5 bits)
RMS: 0.1712×10^-17 (59.0 bits)

GCOSH(x) vs Taylor Series expansion of GCOSH(x)
test interval: 0.0000 through 0.5000
MRE: 0.3469×10^-17 (58.0 bits)
RMS: 0.1234×10^-17 (59.5 bits)

GCOT(x) vs (GCOT(x/2)^2 - 1)/(2·GCOT(x/2))
test interval: 0.1885×10^2 through 0.1963×10^2
MRE: 0.7609×10^-17 (56.9 bits)
RMS: 0.2096×10^-17 (58.7 bits)






GEXP(x-2.8125) vs GEXP(x)/GEXP(2.8125)
test interval: 6.9310 through 0.7090×10^3
MRE: 0.4706×10^-17 (57.6 bits)
RMS: 0.1391×10^-17 (59.3 bits)

GEXP(x-2.8125) vs GEXP(x)/GEXP(2.8125)
test interval: -3.4660 through -0.6682×10^3
MRE: 0.4690×10^-17 (57.6 bits)
RMS: 0.1395×10^-17 (59.3 bits)

GEXP(x-0.0625) vs GEXP(x)/GEXP(0.0625)
test interval: -0.2841 through 0.3466
MRE: 0.3469×10^-17 (58.0 bits)
RMS: 0.1384×10^-17 (59.3 bits)

GEXP3. (x^1.0 vs x)
test interval: 0.5000 through 1.0000
The result is exact.

GEXP3. (XSQ^1.5 vs XSQ·x)
test interval: 0.5000 through 1.0000
MRE: 0.3464×10^-17 (58.0 bits)
RMS: 0.1334×10^-17 (59.4 bits)

GEXP3. (XSQ^1.5 vs XSQ·x)
test interval: 1.0000 through 0.4479×10^103
MRE: 0.3464×10^-17 (58.0 bits)
RMS: 0.1347×10^-17 (59.4 bits)

GEXP3. (x^y vs XSQ^(y/2))
test interval: 1.0000 through 0.1000×10^2 for x
               -0.1543×10^3 through 0.1543×10^3 for y
MRE: 0.3371×10^-16 (54.7 bits)
RMS: 0.4759×10^-17 (57.5 bits)

GLOG(x) vs Taylor Series expansion of GLOG(1+y)
test interval: 1 - 0.1907×10^-5 through 1 + 0.1907×10^-5
MRE: 0.5771×10^-17 (57.3 bits)
RMS: 0.1557×10^-17 (59.2 bits)

GLOG(x) vs GLOG(17x/16) - GLOG(17/16)
test interval: 0.7071 through 0.9375
MRE: 0.3501×10^-17 (58.0 bits)
RMS: 0.1488×10^-17 (59.2 bits)

GLOG(x·x) vs 2·GLOG(x)
test interval: 0.1600×10^2 through 0.2400×10^3
MRE: 0.3393×10^-17 (58.0 bits)
RMS: 0.4781×10^-18 (60.9 bits)

GLOG10(x) vs GLOG10(11x/10) - GLOG10(11/10)
test interval: 0.3162 through 0.9000
MRE: 0.9112×10^-17 (56.6 bits)
RMS: 0.2560×10^-17 (58.4 bits)

GSIN(x) vs 3·GSIN(x/3) - 4·GSIN(x/3)^3
test interval: 0.0000 through 1.5710
MRE: 0.3794×10^-17 (57.9 bits)
RMS: 0.1394×10^-17 (59.3 bits)

GSIN(x) vs 3·GSIN(x/3) - 4·GSIN(x/3)^3
test interval: 0.1885×10^2 through 0.2042×10^2
MRE: 0.5320×10^-17 (57.4 bits)
RMS: 0.1719×10^-17 (59.0 bits)

GSINH(x) vs C·(GSINH(x+1) + GSINH(x-1))
test interval: 3.0000 through 0.7091×10^3
MRE: 0.5035×10^-17 (57.5 bits)
RMS: 0.1730×10^-17 (59.0 bits)

GSINH(x) vs Taylor Series expansion of GSINH(x)
test interval: 0.0000 through 0.5000
MRE: 0.3459×10^-17 (58.0 bits)
RMS: 0.2973×10^-18 (61.5 bits)

GSQRT(x·x) - x
test interval: 0.7071 through 1.0000
MRE: 0.2450×10^-17 (58.5 bits)
RMS: 0.6269×10^-18 (60.5 bits)

GSQRT(x·x) - x
test interval: 1.0000 through 1.4140
The result is exact.

GTAN(x) vs 2·GTAN(x/2)/(1 - GTAN(x/2)^2)
test interval: 2.7490 through 3.5340
MRE: 0.6827×10^-17 (57.0 bits)
RMS: 0.2028×10^-17 (58.8 bits)

GTAN(x) vs 2·GTAN(x/2)/(1 - GTAN(x/2)^2)
test interval: 0.1885×10^2 through 0.1963×10^2
MRE: 0.9834×10^-17 (56.5 bits)
RMS: 0.2760×10^-17 (58.3 bits)

GTAN(x) vs 2·GTAN(x/2)/(1 - GTAN(x/2)^2)
test interval: 0.0000 through 0.7854
MRE: 0.9663×10^-17 (56.5 bits)
RMS: 0.2678×10^-17 (58.4 bits)

GTANH(x) vs (GTANH(x-1/8) + GTANH(1/8))/(1 + GTANH(x-1/8)·GTANH(1/8))
test interval: 0.1250 through 0.5493
MRE: 0.4684×10^-17 (57.6 bits)
RMS: 0.1608×10^-17 (59.1 bits)

GTANH(x) vs (GTANH(x-1/8) + GTANH(1/8))/(1 + GTANH(x-1/8)·GTANH(1/8))
test interval: 0.6743 through 0.2149×10^2
MRE: 0.3750×10^-17 (57.9 bits)
RMS: 0.1621×10^-17 (59.1 bits)

SIN(x) vs 3·SIN(x/3) - 4·SIN(x/3)^3
test interval: 0.0000 through 1.5710
MRE: 0.1934×10^-7 (25.6 bits)
RMS: 0.5980×10^-8 (27.3 bits)

SIN(x) vs 3·SIN(x/3) - 4·SIN(x/3)^3
test interval: 0.1885×10^2 through 0.2042×10^2
MRE: 0.2736×10^-7 (25.1 bits)
RMS: 0.6923×10^-8 (27.1 bits)

SINH(x) vs C·(SINH(x+1) + SINH(x-1))
test interval: 3.0000 through 0.8803×10^2
MRE: 0.3020×10^-7 (25.0 bits)
RMS: 0.7083×10^-8 (27.1 bits)

SINH(x) vs Taylor Series expansion of SINH(x)
test interval: 0.0000 through 0.5000
MRE: 0.1479×10^-7 (26.0 bits)
RMS: 0.1143×10^-8 (29.7 bits)

SQRT(x·x) - x
test interval: 0.7071 through 1.0000
The result is exact.

SQRT(x·x) - x
test interval: 1.0000 through 1.4140
The result is exact.

TAN(x) vs 2·TAN(x/2)/(1 - TAN(x/2)^2)
test interval: 0.1885×10^2 through 0.1963×10^2
MRE: 0.3059×10^-7 (25.0 bits)
RMS: 0.1039×10^-7 (26.5 bits)

TAN(x) vs 2·TAN(x/2)/(1 - TAN(x/2)^2)
test interval: 2.7490 through 3.5340
MRE: 0.2940×10^-7 (25.0 bits)
RMS: 0.7439×10^-8 (27.0 bits)

TAN(x) vs 2·TAN(x/2)/(1 - TAN(x/2)^2)
test interval: 0.0000 through 0.7854
MRE: 0.2994×10^-7 (25.0 bits)
RMS: 0.1074×10^-7 (26.5 bits)

TANH(x) vs (TANH(x-1/8) + TANH(1/8))/(1 + TANH(x-1/8)·TANH(1/8))
test interval: 0.1250 through 0.5493
MRE: 0.2020×10^-7 (25.6 bits)
RMS: 0.6944×10^-8 (27.1 bits)

TANH(x) vs (TANH(x-1/8) + TANH(1/8))/(1 + TANH(x-1/8)·TANH(1/8))
test interval: 0.6743 through 0.1040×10^2
MRE: 0.2156×10^-7 (25.5 bits)
RMS: 0.6360×10^-8 (27.2 bits)
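
The parenthesized bit counts in the results above are consistent with
-log2 of the relative error; for example, 0.9228×10^-18 corresponds to
about 59.9 bits. The following Python sketch illustrates the general shape
of an identity test of this kind. It is not the ELEFUNT code: the names
identity_test, bits, and triple_angle are hypothetical, the arguments are
drawn uniformly at random, and the ELEFUNT programs take additional care in
selecting arguments that this sketch omits. It also runs in the host's
double-precision arithmetic, so its figures will not match the tables.

```python
import math
import random


def identity_test(f, g, lo, hi, n=2000, seed=1):
    """Compare f(x) with g(x) at n random points in [lo, hi].

    Returns (mre, rms): the maximum and the root-mean-square of the
    relative differences |f(x) - g(x)| / |g(x)|.
    """
    rng = random.Random(seed)
    worst = 0.0
    sum_sq = 0.0
    count = 0
    for _ in range(n):
        x = rng.uniform(lo, hi)
        expected = g(x)
        if expected == 0.0:
            continue  # relative error is undefined at a zero
        rel = abs((f(x) - expected) / expected)
        worst = max(worst, rel)
        sum_sq += rel * rel
        count += 1
    return worst, math.sqrt(sum_sq / count)


def bits(rel_err):
    """Express a relative error as a number of correct bits."""
    return -math.log2(rel_err)


def triple_angle(x):
    """Right-hand side of SIN(x) vs 3*SIN(x/3) - 4*SIN(x/3)**3."""
    s = math.sin(x / 3.0)
    return 3.0 * s - 4.0 * s ** 3


mre, rms = identity_test(math.sin, triple_angle, 0.0, 1.5710)
print(f"MRE: {mre:.4e} ({bits(mre):.1f} bits)")
print(f"RMS: {rms:.4e} ({bits(rms):.1f} bits)")
```

Because both sides of the identity are computed with the same routine, a
small difference shows that the routine is self-consistent over the
interval; it does not by itself prove the routine accurate.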



Appendix B 

Using the Common Math Library with MACRO Programs



The Math Library was designed mainly for use by compiler-level languages.
The object-time systems of such languages have facilities to handle error
conditions that may occur when a routine from the Math Library is executed.
MACRO programmers must include such facilities in their programs.

Two facilities are necessary to use the Math Library: a trap handler and an
error handler. The trap handler is needed because, under certain
circumstances, the Math Library executes floating-point instructions that may
overflow or underflow. In these cases, the library routines expect the result
to be set to the largest possible number for floating overflow, or to zero
for underflow. The central processor does not set these results itself; the
overflows and underflows must be detected by the APR trapping system and
interpreted by the trap handler. If the overflow and underflow results are not
set properly, the math routine in question will very likely return
mathematically incorrect results.
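
The result substitution described above can be illustrated with a short
Python sketch. It is only a model of the behavior the library expects, not
the MTHTRP code: trapped_multiply is a hypothetical name, the host's IEEE
double-precision limits stand in for the hardware's floating-point range,
and keeping the sign of an overflowed result is an assumption.

```python
import math
import sys

LARGEST = sys.float_info.max   # stand-in for the largest representable number
SMALLEST = sys.float_info.min  # smallest positive normalized value


def trapped_multiply(a, b):
    """Multiply two finite floats and apply the expected result fix-ups."""
    result = a * b
    if math.isinf(result) and math.isfinite(a) and math.isfinite(b):
        # Overflow: substitute the largest number (sign kept; an assumption).
        return math.copysign(LARGEST, result)
    if result != 0.0 and abs(result) < SMALLEST:
        # Underflow into the subnormal range: substitute zero.
        return 0.0
    # In range, or already underflowed to exactly zero.
    return result


print(trapped_multiply(1e200, 1e200) == LARGEST)
print(trapped_multiply(1e-200, 1e-200))
```

A routine that continues computing with these substituted values stays
within the representable range, which is what the library routines rely on.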

The error handler is a general error printout routine. It is called by the Math 
Library when the arguments passed to a Math Library routine are out of range 
or otherwise incorrect. 

Provided with the Math Library are modules for handling APR traps and 
properly setting the results (MTHTRP) and for providing error handling and 
reporting (MTHDUM). A MACRO program must initialize these modules 
before using any other components of the Math Library, as follows: 

        PUSHJ   P,%TRPIN##      ;INITIALIZE TRAP HANDLER
        PUSHJ   P,%ERINI##      ;INITIALIZE ERROR HANDLER






Index 



A



ABS routine, 9-4 
Absolute value 
complex, 9-7 

double-precision D-floating-point, 9-8 
double-precision G-floating-point, 9-9 
double-precision, 

D-floating-point, 9-5 
G-floating-point, 9-6 
integer, 9-3 
single-precision, 9-4 
Accuracy tests, 1-14 
ACOS routine, 6-4 
AIMAG routine, 15-4
AINT routine, 11-9
ALOG routine, 3-3
ALOG10 routine, 3-5
AMAX0 routine, 14-5
AMAX1 routine, 14-6
AMIN0 routine, 14-11
AMIN1 routine, 14-12 
AMOD routine, 12-6 
ANINT routine, 11-6 
Arc cosine 

double-precision, 

D-floating-point, 6-7 
G-floating-point, 6-11 
single-precision, 6-4 
Arc sine 

double-precision, 

D-floating-point, 6-5 
G-floating-point, 6-9 
single-precision, 6-3 
Arc tangent 

double-precision, 

D-floating-point, 6-17 
G-floating-point, 6-21 
single-precision, 6-13 



ASIN routine, 6-3 
ATAN routine, 6-13 
ATAN2 routine, 6-15 
Average relative error, 1-14 



B 



Base-10 logarithm,

double-precision, 

D-floating-point, 3-9 
G-floating-point, 3-13 

single-precision, 3-5 



C



CABS routine, 9-7 
Calling sequence, 1-13 
CCOS routine, 5-21 
CDABS routine, 9-8 
CDCOS routine, 5-25 
CDEXP routine, 4-11 
CDLOG routine, 3-17 
CDSIN routine, 5-23 
CDSQRT routine, 2-11 
CEXP routine, 4-9 
CEXP2. routine, 4-22 
CEXP3. routine, 4-34 
CFDV routine, 15-7 
CFM routine, 15-6 
CGABS routine, 9-9 
CGCOS routine, 5-29 
CGEXP routine, 4-13 
CGLOG routine, 3-19 
CGSIN routine, 5-27 
CGSQRT routine, 2-13 
CLOG routine, 3-15 
CMPL.C routine, 10-23 
CMPL.D routine, 10-21 
CMPL.G routine, 10-22
CMPL.I routine, 10-19
CMPLX routine, 10-20
Cody, W. J., 1-15, A-1
Cody and Waite, Software Manual for
Elementary Functions, 5-32, 5-34,
5-36, 5-38, 5-40
Complex, 

absolute value, 9-7 
conjugate, 15-5 
conversion, 

complex to complex, 10-23 
cosine, 5-21 
data types, 1-12 
division, 15-7 
double-precision D-floating-point, 1-12 

absolute value, 9-8 

cosine, 5-25 

exponential, 4-11 

natural logarithm, 3-17 

sine, 5-23 

square root, 2-11 
double-precision G-floating-point, 1-12 

absolute value, 9-9 

cosine, 5-29 

exponential, 4-13 

natural logarithm, 3-19 

sine, 5-27 

square root, 2-13 
exponential, 4-9 
exponentiation, 

complex to complex, 4-34 

complex to integer, 4-22 
multiplication, 15-6 
natural logarithm, 3-15 
number, 

imaginary part, 15-4 

real part, 15-3 
product, 15-6 
quotient, 15-7 
sine, 5-19 
square root, 2-9 
Computer Approximations, 

Hart et al., 3-4, 3-6, 6-14, 6-18, 6-22
CONJ routine, 15-5 
Conjugate 

complex, 15-5 
Conversion 

complex to complex, 10-23 
double-precision, 

D-floating-point to complex, 10-20 

D-floating-point to G-floating-point, 
10-17, 10-18 

D-floating-point to integer, 10-5 



D-floating-point to single-precision, 10-9 
G-floating-point to complex, 10-22 
G-floating-point to D-floating-point, 

10-13, 10-14 
G-floating-point to integer, 10-6 
G-floating-point to single-precision,
10-10
integer, 

to complex, 10-19

to double-precision D-floating-point, 

10-11 
to double-precision G-floating-point, 

10-15 
to single-precision, 10-7, 10-8 
single-precision, 
to complex, 10-20 
to double-precision D-floating-point, 

10-12 
to double-precision G-floating-point, 

10-16 
to integer, 10-3, 10-4 
COS routine, 5-7 
COSD routine, 5-9 
COSH routine, 7-4 
Cosine, 

complex, 5-21

double-precision D-floating-point, 5-25 
double-precision G-floating-point, 5-29 
double-precision, 

D-floating-point, 5-13 
G-floating-point, 5-17 
single-precision, 5-7, 5-9 
COTAN routine, 5-33 
Cotangent, 

double-precision, 

D-floating-point, 5-37 
G-floating-point, 5-41
single-precision, 5-33 
Coveyou, R. R. and MacPherson, R. D.,
Journal of the ACM, #14, 8-4
CSIN routine, 5-19 
CSQRT routine, 2-9 



D 



DABS routine, 9-5 
DACOS routine, 6-7 
DASIN routine, 6-5 
DATAN routine, 6-17 
DATAN2 routine, 6-19
Data types, 1-10

complex, 1-12 

double-precision, 

D-floating-point, 1-11 
G-floating-point, 1-11 

integer, 1-10 

single-precision, 1-10 
DBLE routine, 10-12 
DCOS routine, 5-13 
DCOSH routine, 7-7 
DCOTAN routine, 5-37 
DDIM routine, 12-11 
DEXP routine, 4-5 
DEXP2. routine, 4-18 
DEXP3. routine, 4-28 
DFLOAT routine, 10-11 
D-floating-point, 

absolute value, 9-5 

arc cosine, 6-7 

arc sine, 6-5 

arc tangent, 6-17 

base-10 logarithm, 3-9

conversion, 

to complex, 10-21 

to G-floating-point, 10-17, 10-18 

to integer, 10-5 

to single-precision, 10-9 

cosine, 5-13 

cotangent, 5-37 

data type, 1-11 

exponential, 4-5 

exponentiation, 

to D-floating-point, 4-28 
to integer, 4-18 

hyperbolic cosine, 7-7 

hyperbolic sine, 7-5 

hyperbolic tangent, 7-12 

maximum of a series, 14-7 

minimum of a series, 14-13 

natural logarithm, 3-7 

polar angle of two points, 6-19 

positive difference, 12-11 

product, 12-3 

remainder, 12-7 

rounding, 

to D-floating-point, 11-7 
to integer, 11-4 

sine, 5-11 

square root, 2-5 

tangent, 5-35 

transfer of sign, 13-5 

truncation, 11-10 
DIM routine, 12-10 



DINT routine, 11-10 
Division, complex, 15-7 
DLOG routine, 3-7 
DLOG10 routine, 3-9 
DMAX1 routine, 14-7 
DMIN1 routine, 14-13 
DMOD routine, 12-7 
DNINT routine, 11-7 
Double precision, 
data types, 1-11 
D-floating-point, 1-11 
absolute value, 9-5 
arc cosine, 6-7 
arc sine, 6-5 
arc tangent, 6-17 
base-10 logarithm, 3-9 
conversion, 

to complex, 10-21 
to G-floating-point, 10-17, 10-18 
to integer, 10-5 
to single-precision, 10-9 
cosine, 5-13 
cotangent, 5-37 
exponential, 4-5
exponentiation, 

to D-floating-point, 4-28 
to integer, 4-18 
hyperbolic cosine, 7-7 
hyperbolic sine, 7-5 
hyperbolic tangent, 7-12 
maximum of a series, 14-7 
minimum of a series, 14-13 
natural logarithm, 3-7 
polar angle of two points, 6-19 
positive difference, 12-11 
product, 12-3 
remainder, 12-7 
rounding, 

to D-floating-point, 11-7 
to integer, 11-4 
sine, 5-11 
square root, 2-5 
tangent, 5-35 
transfer of sign, 13-5 
truncation, 11-10 
G-floating-point, 1-11 
absolute value, 9-6 
arc cosine, 6-11 
arc sine, 6-9 
arc tangent, 6-21 
base-10 logarithm, 3-13 
conversion, 

to complex, 10-22

to D-floating-point, 10-13, 10-14 

to integer, 10-6 

to single-precision, 10-10 

cosine, 5-17 

cotangent, 5-41 

exponential, 4-7

exponentiation, 

to G-floating-point, 4-31 
to integer, 4-20 

hyperbolic cosine, 7-10 

hyperbolic sine, 7-8 

hyperbolic tangent, 7-13 

maximum of a series, 14-8 

minimum of a series, 14-14 

natural logarithm, 3-11 

polar angle of two points, 6-23 

positive difference, 12-12 

product, 12-4 

remainder, 12-8 

rounding, 

to G-floating-point, 11-8 
to integer, 11-5 

sine, 5-15 

square root, 2-7 

tangent, 5-39 

transfer of sign, 13-6 

truncation, 11-11 
DPROD routine, 12-3 
DSIGN routine, 13-5 
DSIN routine, 5-11 
DSINH routine, 7-5 
DSQRT routine, 2-5 
DTAN routine, 5-35 
DTANH routine, 7-12 
DTOG routine, 10-17 
DTOGA routine, 10-18 



E 



ELEFUNT tests, 1-15, A-1
Entry points, 1-13 
Error, 

maximum relative (MRE), 1-14 

average relative (RMS), 1-14 
EXP routine, 4-3 
EXP1. routine, 4-15 
EXP2. routine, 4-16 
EXP3. routine, 4-25 
Exponential, 

complex, 4-9 

double-precision D-floating-point, 4-11 
double-precision G-floating-point, 4-13 



double-precision,

D-floating-point, 4-5
G-floating-point, 4-7
single-precision, 4-3 

Exponentiation, 

complex to complex, 4-34 
complex to integer, 4-22 
D-floating-point to D-floating-point, 4-28 
D-floating-point to integer, 4-18 
G-floating-point to G-floating-point, 4-31 
G-floating-point to integer, 4-20 
integer to integer, 4-15
single-precision to integer, 4-16 
single-precision to single-precision, 4-25 



F



FLOAT routine, 10-8 
Functions, 

math library, 1-3 

G 

GABS routine, 9-6 
GACOS routine, 6-11 
GASIN routine, 6-9 
GATAN routine, 6-21 
GATAN2 routine, 6-23 
GCOS routine, 5-17 
GCOSH routine, 7-10 
GCOTAN routine, 5-41 
GDB.n routine, 10-16 
GDIM routine, 12-12 
GEXP routine, 4-7 
GEXP2. routine, 4-20 
GEXP3. routine, 4-31 
GFL.n routine, 10-15 
G-floating-point, 

absolute value, 9-6 

arc cosine, 6-11 

arc sine, 6-9 

arc tangent, 6-21 

base-10 logarithm, 3-13 

conversion, 

to complex, 10-22 

to D-floating-point, 10-13, 10-14 

to integer, 10-6 

to single-precision, 10-10 

cosine, 5-17 

cotangent, 5-41 

data type, 1-11 

exponential, 4-7 



exponentiation,

to G-floating-point, 4-31
to integer, 4-20 
hyperbolic cosine, 7-10 
hyperbolic sine, 7-8 
hyperbolic tangent, 7-13 
maximum of a series, 14-8 
minimum of a series, 14-14 
natural logarithm, 3-11 
polar angle of two points, 6-23 
positive difference, 12-12 
product, 12-4 
remainder, 12-8 
rounding, 11-8 

to G-floating-point, 11-8 
to integer, 11-5 
sine, 5-15 
square root, 2-7 
tangent, 5-39 
transfer of sign, 13-6 
truncation, 11-11 

GFX.n routine, 10-6 

GINT. routine, 11-11 

GLOG routine, 3-11 

GLOG10 routine, 3-13 

GMAX1 routine, 14-8 

GMIN1 routine, 14-14 

GMOD routine, 12-8 

GNINT. routine, 11-8 

GPROD. routine, 12-4 

GSIGN routine, 13-6 

GSIN routine, 5-15 

GSINH routine, 7-8 

GSN.n routine, 10-10

GSQRT routine, 2-7 

GTAN routine, 5-39 

GTANH routine, 7-13 

GTOD routine, 10-13 

GTODA routine, 10-14 



H



Hart et al., Computer Approximations,
3-4, 3-6, 6-14, 6-18, 6-22
Hyperbolic cosine,
double-precision,
D-floating-point, 7-7
G-floating-point, 7-10
single-precision, 7-4
Hyperbolic sine,
double-precision,
D-floating-point, 7-5
G-floating-point, 7-8
single-precision, 7-3
Hyperbolic tangent,
double-precision,
D-floating-point, 7-12
G-floating-point, 7-13
single-precision, 7-11



I



IABS routine, 9-3
IDIM routine, 12-9
IDINT routine, 10-5
IDNINT routine, 11-4
IFIX routine, 10-3
IGNIN. routine, 11-5
Imaginary part of a complex number, 15-4
INT routine, 10-4
Integer,
absolute value, 9-3
conversion,
to complex, 10-19
to D-floating-point, 10-11
to G-floating-point, 10-15
to single-precision, 10-7, 10-8
data type, 1-10
exponentiation, 4-15
maximum, 14-3, 14-4
minimum, 14-9, 14-10
positive difference, 12-9
remainder, 12-5
transfer of sign, 13-3
ISIGN routine, 13-3



J



Journal of the ACM, #14,
Coveyou, R. R. and MacPherson, R. D., 8-4



K



Knuth, D. E., Seminumerical Algorithms, 8-4



L


Logarithm, see natural logarithm, 

base-10 logarithm
LSB (least significant bit) error distribution, 

1-15 



M 



MACRO programs, using the math
library with, B-1
Math library,

functions, 1-3 

restrictions, 1-8 

with MACRO programs, B-1
Mathematical names, 1-9
Mathematical symbols, 1-9
MAX0 routine, 14-3
MAX1 routine, 14-4 
Maximum of a series, 

double-precision, 

D-floating-point, 14-7 
G-floating-point, 14-8 

integer, 14-3, 14-4 

single-precision, 14-5, 14-6
Maximum relative error, 1-14 
MIN0 routine, 14-9
MIN1 routine, 14-10
Minimum of a series, 

double-precision, 

D-floating-point, 14-13 
G-floating-point, 14-14 

integer, 14-9, 14-10 

single-precision, 14-11, 14-12
MOD routine, 12-5 
MRE (maximum relative error), 1-14 
Multiplication, complex, 15-6 



N 



Names, mathematical, 1-9 
Natural logarithm 
complex, 3-15 

double-precision D-floating-point, 3-17 
double-precision G-floating-point, 3-19 
double-precision, 

D-floating-point, 3-7 
G-floating-point, 3-11 
single-precision, 3-3 
Newton-Raphson method, 2-4, 2-6, 2-8
NINT routine, 11-3 



P



Polar angle of two points, 

double-precision,

D-floating-point, 6-19
G-floating-point, 6-23

single-precision, 6-15 
Positive difference, 

double-precision,

D-floating-point, 12-11 
G-floating-point, 12-12 

integer, 12-9 

single-precision, 12-10 



Precision, 1-10 
Product, 

complex, 15-6 
double-precision, 

D-floating-point, 12-3 
G-floating-point, 12-4 

Q 

Quotient, complex, 15-7 

R 

RAN routine, 8-3
Random number generator, 8-3 
spectral test with, 8-3 
with shuffling, 8-5 
Random number seed, 
saving, 8-7 
setting, 8-6 
RANS routine, 8-5 
REAL routine, 10-7 
REAL.C routine, 15-3 
Real part of a complex number, 15-3 
Register usage, 1-13 
Relative error 

average (RMS), 1-14 
maximum (MRE), 1-14 
Remainder, 

double-precision, 

D-floating-point, 12-7 
G-floating-point, 12-8 
integer, 12-5 
single-precision, 12-6 
Restrictions, math library, 1-8 
Return location, 1-13 
RMS (root mean square), 1-14 
Root mean square (RMS), 1-14 
Rounding, 

double-precision, 
D-floating-point, 

to D-floating-point, 11-7 
to integer, 11-4 
G-floating-point,

to G-floating-point, 11-8
to integer, 11-5
single-precision,
to integer, 11-3
to single-precision, 11-6 



S



Saving random number seed, 8-7 
SAVRAN routine, 8-7
Seminumerical algorithms,

Knuth, D. E., 8-4 
SETRAN routine, 8-6 
Setting random number seed, 8-6 
SIGN routine, 13-4 
Sign, transfer, 
double-precision, 

D-floating-point, 13-5 

G-floating-point, 13-6 
integer, 13-3 
single-precision, 13-4 
SIN routine, 5-3 
SIND routine, 5-5 
Sine, 

complex, 5-19 

double-precision D-floating-point, 5-23 

double-precision G-floating-point, 5-27 
double-precision, 

D-floating-point, 5-11 

G-floating-point, 5-15 
single-precision, 5-3, 5-5 
Single-precision, 
absolute value, 9-4 
arc cosine, 6-4
arc sine, 6-3 
arc tangent, 6-13 
base-10 logarithm, 3-5 
conversion, 

to complex, 10-20 

to D-floating-point, 10-12 

to G-floating-point, 10-16 

to integer, 10-3, 10-4 
cosine, 5-7, 5-9 
cotangent, 5-33 
data type, 1-10 
exponential, 4-3 
exponentiation, 

to integer, 4-16 

to single-precision, 4-25 
hyperbolic cosine, 7-4
hyperbolic sine, 7-3
hyperbolic tangent, 7-11
maximum of a series, 14-5, 14-6
minimum of a series, 14-11, 14-12 
natural logarithm, 3-3 
polar angle of two points, 6-15 
positive difference, 12-10 
remainder, 12-6 



rounding, 

to integer, 11-3 
to single-precision, 11-6 
sine, 5-3, 5-5 
square root, 2-3 
tangent, 5-31 
transfer of sign, 13-4 
truncation, 11-9 
SINH routine, 7-3 
SNGL routine, 10-9 

Software Manual for Elementary Functions, 
Cody and Waite, 5-32, 5-34, 5-36, 5-38, 
5-40 
Spectral test with random number generator, 

8-3 
SQRT routine, 2-3 
Square root, 
complex, 2-9 

double-precision D-floating-point, 2-11 
double-precision G-floating-point, 2-13 
double-precision, 

D-floating-point, 2-5 
G-floating-point, 2-7 
single-precision, 2-3 
Symbols, mathematical, 1-9 



T



TAN routine, 5-31 
Tangent, 

double-precision, 

D-floating-point, 5-35 
G-floating-point, 5-39 
single-precision, 5-31 
TANH routine, 7-11 
Test interval, 1-14 
Tests, accuracy, 1-14 
Transfer of sign, 
double-precision, 

D-floating-point, 13-5 
G-floating-point, 13-6 
integer, 13-3 
single-precision, 13-4
Truncation, 

double-precision, 

D-floating-point, 11-10 
G-floating-point, 11-11 
single-precision, 11-9 






TOPS-10/TOPS-20 

Common Math Library 

Reference Manual 

AA-M400A-TK 
